Streptococcus pneumoniae Polynucleotides and Sequences

FIELD OF THE INVENTION 

The present invention relates to the field of molecular biology. In
particular, it relates to, among other things, nucleotide sequences of
Streptococcus pneumoniae, contigs, ORFs, fragments, probes, primers and related
polynucleotides thereof, peptides and polypeptides encoded by the sequences, and
uses of the polynucleotides and sequences thereof, such as in fermentation,
polypeptide production, assays and pharmaceutical development, among others.

BACKGROUND OF THE INVENTION 

Streptococcus pneumoniae has been one of the most extensively studied
microorganisms since its first isolation in 1881. It was the object of many
investigations that led to important scientific discoveries. In 1928, Griffith
observed that when heat-killed encapsulated pneumococci and live strains
constitutively lacking any capsule were concomitantly injected into mice, the
nonencapsulated strains could be converted into encapsulated pneumococci with
the same capsular type as the heat-killed strain. Years later, the nature of
this "transforming principle," or carrier of genetic information, was shown to
be DNA. (Avery, O.T., et al., J. Exp. Med. 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many
questions about its virulence are still unanswered, and this pathogen remains a
major causative agent of serious human disease, especially community-acquired
pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)). In addition, in developing countries, the pneumococcus is responsible for
the death of a large number of children under the age of 5 years from pneumococcal
pneumonia. The incidence of pneumococcal disease is highest in infants under 2
years of age and in people over 60 years of age. Pneumococci are the second most
frequent cause (after Haemophilus influenzae type b) of bacterial meningitis and
otitis media in children. With the recent introduction of conjugate vaccines for H.
influenzae type b, pneumococcal meningitis is likely to become increasingly
prominent. S. pneumoniae is the most important etiologic agent of
community-acquired pneumonia in adults and is the second most common cause of
bacterial meningitis behind Neisseria meningitidis.

The antibiotic generally prescribed to treat S. pneumoniae is
benzylpenicillin, although resistance to this and to other antibiotics is found
occasionally. Pneumococcal resistance to penicillin results from mutations in its
penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused by
a sensitive strain, treatment with penicillin is usually successful unless started too
late. Erythromycin or clindamycin can be used to treat pneumonia in patients
hypersensitive to penicillin, but strains resistant to these drugs exist. Broad
spectrum antibiotics (e.g., the tetracyclines) may also be effective, although
tetracycline-resistant strains are not rare. In spite of the availability of antibiotics,
the mortality of pneumococcal bacteremia in the last four decades has remained
stable between 25 and 29%. (Gillespie, S.H., et al., J. Med. Microbiol.
25:237-248 (1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy
individuals. It has been suggested that attachment of pneumococci is mediated by a
disaccharide receptor on fibronectin, present on human pharyngeal epithelial cells.
(Anderson, B.J., et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by
which pneumococci translocate from the nasopharynx to the lung, thereby causing
pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are
poorly understood. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl.
6):S509-517 (1991)).

Various proteins have been suggested to be involved in the pathogenicity of
S. pneumoniae; however, only a few of them have actually been confirmed as
virulence factors. Pneumococci produce an IgA1 protease that might interfere with
host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev. Inf. Dis.
5:521-534 (1981)). S. pneumoniae also produces neuraminidase, an enzyme that may
facilitate attachment to epithelial cells by cleaving sialic acid from the host
glycolipids and gangliosides. Partially purified neuraminidase was observed to
induce meningitis-like symptoms in mice; however, the reliability of this finding
has been questioned because the neuraminidase preparations used were probably
contaminated with cell wall products. Other pneumococcal proteins besides
neuraminidase are involved in the adhesion of pneumococci to epithelial and
endothelial cells. These pneumococcal proteins have not yet been identified.

Recently, Cundell et al. reported that peptide permeases can modulate
pneumococcal adherence to epithelial and endothelial cells. It was, however,
unclear whether these permeases function directly as adhesins or whether they
enhance adherence by modulating the expression of pneumococcal adhesins.
(DeVelasco, E.A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding
of the virulence factors determining its pathogenicity will need to be developed to
cope with the devastating effects of pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery of
DNA, little is known about the molecular genetics of the organism. The S.
pneumoniae genome consists of one circular, covalently closed, double-stranded
DNA and a collection of so-called variable accessory elements, such as prophages,
plasmids, transposons and the like. Most physical characteristics and almost all of
the genes of S. pneumoniae are unknown. Among the few that have been
identified, most have not been physically mapped or characterized in detail. Only a
few genes of this organism have been sequenced. (See, for instance, current
versions of GENBANK and other nucleic acid databases, and references that relate
to the genome of S. pneumoniae such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S.
pneumoniae infection involves the programmed expression of S. pneumoniae
genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions.
Knowledge of S. pneumoniae genes and genomic organization would improve our
understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover,
characterized genes and genomic fragments of S. pneumoniae would provide
reagents for, among other things, detecting, characterizing and controlling S.
pneumoniae infections. There is a need to characterize the genome of S.
pneumoniae and for polynucleotides of this organism.

SUMMARY OF THE INVENTION

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome. The primary nucleotide sequences which were
generated are provided in SEQ ID NOS: 1-391.

The present invention provides the nucleotide sequence of several hundred
contigs of the Streptococcus pneumoniae genome, which are listed in tables below
and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted
by a skilled artisan. In one embodiment, the present invention is provided as
contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS: 1-391.

The present invention further provides nucleotide sequences which are at
least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-391.

The nucleotide sequence of SEQ ID NOS: 1-391, a representative fragment
thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide
sequence of SEQ ID NOS: 1-391 may be provided in a variety of media to
facilitate its use. In one application of this embodiment, the sequences of the
present invention are recorded on computer readable media. Such media include,
but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical storage media such as RAM and ROM; and hybrids of these categories
such as magnetic/optical storage media.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information herein described stored in a
data storage means. Such systems are designed to identify commercially important
fragments of the Streptococcus pneumoniae genome.

Another embodiment of the present invention is directed to fragments of the
Streptococcus pneumoniae genome having particular structural or functional
attributes. Such fragments of the Streptococcus pneumoniae genome of the present
invention include, but are not limited to, fragments which encode peptides,
hereinafter referred to as open reading frames or ORFs, fragments which modulate
the expression of an operably linked ORF, hereinafter referred to as expression
modulating fragments or EMFs, and fragments which can be used to diagnose the
presence of Streptococcus pneumoniae in a sample, hereinafter referred to as
diagnostic fragments or DFs.

Each of the ORFs in fragments of the Streptococcus pneumoniae genome
disclosed in Tables 1-3, and the EMFs found 5' to the ORFs, can be used in
numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the
presence of a specific microbe in a sample, to selectively control gene expression in
a host and in the production of polypeptides, such as polypeptides encoded by
ORFs of the present invention, particularly those polypeptides that have a
pharmacological activity.

The present invention further includes recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genome of the present
invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Streptococcus
pneumoniae genome has been inserted.

The present invention further provides host cells containing any of the
isolated fragments of the Streptococcus pneumoniae genome of the present
invention. The host cell can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a
bacterial cell.

The present invention is further directed to isolated polypeptides and
proteins encoded by ORFs of the present invention. A variety of methods, well
known to those of skill in the art, routinely may be utilized to obtain any of the
polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid
sequences readily can be synthesized using commercially available automated
peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another
alternative is to purify polypeptides and proteins of the present invention from cells
which have been altered to express them.

The invention further provides methods of obtaining homologs of the
fragments of the Streptococcus pneumoniae genome of the present invention and
homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as
a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind
polypeptides and proteins of the present invention. Such antibodies include both
monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the
above-described antibodies. A hybridoma is an immortalized cell line which is
capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples
derived from cells which express one of the ORFs of the present invention, or a
homolog thereof. Such methods comprise incubating a test sample with one or
more of the antibodies of the present invention, or one or more of the DFs of the
present invention, under conditions which allow a skilled artisan to determine if the
sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers, which comprises: (a) a first container
comprising one of the antibodies, or one of the DFs of the present invention; and
(b) one or more other containers comprising one or more of the following: wash
reagents, and reagents capable of detecting the presence of bound antibodies or
hybridized DFs.

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents capable of binding to
a polypeptide or protein encoded by one of the ORFs of the present invention.
Specifically, such agents include, as further described below, antibodies, peptides,
carbohydrates, pharmaceutical agents and the like. Such methods comprise steps
of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of
the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Streptococcus pneumoniae will be of
great value to all laboratories working with this organism and for a variety of
commercial purposes. Many fragments of the Streptococcus pneumoniae genome
will be immediately identified by similarity searches against GenBank or protein
databases and will be of immediate value to Streptococcus pneumoniae researchers
and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic
sequences of bacterial and other genomes have greatly enhanced, and will
continue to enhance, the ability to analyze and understand chromosomal
organization. In particular, sequenced contigs and genomes will provide the
models for developing tools for the analysis of chromosome structure and
function, including the ability to identify genes within large segments of
genomic DNA, the structure, position, and spacing of regulatory elements, the
identification of genes with potential industrial applications, and the ability
to perform comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES 

FIGURE 1 is a block diagram of a computer system (102) that can be
used to implement computer-based systems of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer
programs used to collect, assemble, edit and annotate the contigs of the
Streptococcus pneumoniae genome of the present invention. Both Macintosh and
Unix platforms are used to handle the AB 373 and 377 sequence data files, largely
as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii
International Conference on System Sciences, 585, IEEE Computer Society Press,
Washington, D.C. (1993). Factura (AB) is a Macintosh program designed for
automatic vector sequence removal and end-trimming of sequence files. The
program Loadis runs on a Macintosh platform and parses the feature data extracted
from the sequence files by Factura to the Unix-based Streptococcus pneumoniae
relational database. Assembly of contigs (and whole genome sequences) is
accomplished by retrieving a specific set of sequence files and their associated
features using Extrseq, a Unix utility for retrieving sequences from an SQL
database. The resulting sequence file is processed by seq_filter to trim portions of
the sequences with more than 2% ambiguous nucleotides. The sequence files were
assembled using TIGR Assembler, an assembly engine designed at The Institute
for Genomic Research (TIGR) for rapid and accurate assembly of thousands of
sequence fragments. The collection of contigs generated by the assembly step is
loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf or GenMark. The
ORFs are searched against S. pneumoniae sequences from GenBank and against all
protein sequences using the BLASTN and BLASTP programs, described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF
determination and similarity searching steps were loaded into the database. As
described below, some results of the determination and the searches are set out in
Tables 1-3.
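
By way of non-limiting illustration, the trimming step attributed above to
seq_filter might be sketched as follows. The source states only the 2%
threshold; the function names, the fixed window of 50 bases, and the rule of
discarding whole windows from either end are assumptions made for this sketch
and are not a description of the actual seq_filter program.

```python
UNAMBIGUOUS = frozenset("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are not A, C, G or T."""
    if not seq:
        return 0.0
    return sum(base not in UNAMBIGUOUS for base in seq.upper()) / len(seq)


def trim_ambiguous_ends(seq: str, window: int = 50, max_fraction: float = 0.02) -> str:
    """Drop leading and trailing windows whose ambiguous fraction exceeds max_fraction.

    The window size and the end-trimming rule are illustrative assumptions.
    """
    start, end = 0, len(seq)
    # Trim from the 5' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[start:start + window]) > max_fraction:
        start += window
    # Trim from the 3' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[end - window:end]) > max_fraction:
        end -= window
    return seq[start:end]


if __name__ == "__main__":
    # Placeholder read: noisy ends around a clean 100-base core.
    read = "N" * 5 + "A" * 45 + "ACGT" * 25 + "C" * 45 + "N" * 5
    print(len(trim_ambiguous_ends(read)))
```

A window-based rule is used here because ambiguous base calls tend to cluster
at the ends of a read; other trimming rules would serve equally well.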

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 

The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome and analysis of the sequences. The primary
nucleotide sequences generated by sequencing the fragments are provided in SEQ
ID NOS: 1-391. (As used herein, the "primary sequence" refers to the nucleotide
sequence represented by the IUPAC nomenclature system.)

In addition to the aforementioned Streptococcus pneumoniae polynucleotides
and polynucleotide sequences, the present invention provides the nucleotide
sequences of SEQ ID NOS: 1-391, or representative fragments thereof, in a form
which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence
depicted in SEQ ID NOS: 1-391" refers to any portion of the SEQ ID NOS: 1-391
which is not presently represented within a publicly available database. Preferred
representative fragments of the present invention are Streptococcus pneumoniae
open reading frames (ORFs), expression modulating fragments (EMFs) and
fragments which can be used to diagnose the presence of Streptococcus
pneumoniae in a sample (DFs). A non-limiting identification of preferred
representative fragments is provided in Tables 1-3. As discussed in detail below,
the information provided in SEQ ID NOS: 1-391 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled
in the art to clone and sequence all "representative fragments" of interest, including
open reading frames encoding a large variety of Streptococcus pneumoniae
proteins.

While the presently disclosed sequences of SEQ ID NOS: 1-391 are highly
accurate, sequencing techniques are not perfect and, in relatively rare instances,
further investigation of a fragment or sequence of the invention may reveal a
nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID
NOS: 1-391. However, once the present invention is made available (i.e., once the
information in SEQ ID NOS: 1-391 and Tables 1-3 has been made available),
resolving a rare sequencing error in SEQ ID NOS: 1-391 will be well within the
skill of the art. The present disclosure makes available sufficient sequence
information to allow any of the described contigs or portions thereof to be obtained
readily by straightforward application of routine techniques. Further sequencing of
such polynucleotides may proceed in like manner using manual and automated
sequencing methods which are employed ubiquitously in the art. Nucleotide
sequence editing software is publicly available. For example, Applied Biosystems'
(AB) AutoAssembler can be used as an aid during visual inspection of nucleotide
sequences. By employing such routine techniques potential errors readily may be
identified and the correct sequence then may be ascertained by targeting further
sequencing effort, also of a routine nature, to the region containing the potential
error.

Even if all of the very rare sequencing errors in SEQ ID NOS: 1-391 were
corrected, the resulting nucleotide sequences would still be at least 95% identical,
nearly all would be at least 99% identical, and the great majority would be at least
99.9% identical to the nucleotide sequences of SEQ ID NOS: 1-391.

As discussed elsewhere herein, polynucleotides of the present invention
readily may be obtained by routine application of well known and standard
procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of
Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae
genomic DNA for cloning and for obtaining polynucleotides of the present
invention are available to the public from recognized depository institutions, such
as the American Type Culture Collection (ATCC). While the present invention is
enabled by the sequences and other information herein disclosed, the S.
pneumoniae strain that provided the DNA of the present Sequence Listing, Strain
7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill
in the art. As a further convenience, a library of S. pneumoniae genomic DNA,
derived from the same strain, also has been deposited in the ATCC. The S.
pneumoniae strain was deposited on October 10, 1996, and was given Deposit No.
55840, and the genomic DNA library was deposited on October 11, 1996 and was
given Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb
fragments generated by partial Sau3AI digestion and they are inserted into the
BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene,
La Jolla, CA). The provision of the deposits is not a waiver of any rights of the
inventors or their assignees in the present subject matter.

The nucleotide sequences of the genomes from different strains of
Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences
of the genomes of all Streptococcus pneumoniae strains will be at least 95%
identical, in corresponding part, to the nucleotide sequences provided in SEQ ID
NOS: 1-391. Nearly all will be at least 99% identical and the great majority will be
99.9% identical.

Thus, the present invention further provides nucleotide sequences which 
are at least 95%, preferably 99% and most preferably 99.9% identical to the 
nucleotide sequences of SEQ ID NOS: 1-391, in a form which can be readily used, 
analyzed and interpreted by the skilled artisan. 

Methods for determining whether a nucleotide sequence is at least 95%, at
least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID
NOS: 1-391 are routine and readily available to the skilled artisan. For example, the
well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad.
Sci. USA 85:2444 (1988) can be used to generate the percent identity of nucleotide
sequences. The BLASTN program also can be used to generate an identity score
of polynucleotides compared to one another.
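
As a hedged illustration of the percent identity thresholds recited above, the
following sketch computes percent identity over two sequences that have already
been aligned (for example, by one of the programs just mentioned). It does not
perform the alignment itself and is not the FASTA or BLASTN algorithm. The
function names, the use of "-" as the gap character, and the choice of the
number of alignment columns as the denominator are assumptions of the sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must be rows of one pairwise alignment, with '-' marking gaps.
    Gap columns count toward the denominator but never as matches (an assumption;
    other conventions exclude them).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """Return True when percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Placeholder sequences: 20 columns, one mismatch.
    reference = "ACGTACGTACGTACGTACGT"
    candidate = "ACGTACGTACGTACGTACGA"
    print(percent_identity(reference, candidate), meets_threshold(reference, candidate))
```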

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS: 1-391, a representative
fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99%
and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ
ID NOS: 1-391 may be "provided" in a variety of media to facilitate use thereof.
As used herein, "provided" refers to a manufacture, other than an isolated nucleic
acid molecule, which contains a nucleotide sequence of the present invention; i.e.,
a nucleotide sequence provided in SEQ ID NOS: 1-391, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most
preferably at least 99.9% identical to a polynucleotide of SEQ ID NOS: 1-391.
Such a manufacture provides a large portion of the Streptococcus pneumoniae
genome and parts thereof (e.g., a Streptococcus pneumoniae open reading frame
(ORF)) in a form which allows a skilled artisan to examine the manufacture using
means not directly applicable to examining the Streptococcus pneumoniae genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media,
such as floppy discs, hard disc storage medium, and magnetic tape; optical storage
media such as CD-ROM; electrical storage media such as RAM and ROM; and
hybrids of these categories, such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable
media can be used to create a manufacture comprising computer readable
medium having recorded thereon a nucleotide sequence of the present invention.
Likewise, it will be clear to those of skill how additional computer readable media
that may be developed also can be used to create analogous manufactures having
recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adapt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention. A variety of data storage structures are available to a skilled artisan
for creating a computer readable medium having recorded thereon a nucleotide
sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be used to store the
nucleotide sequence information of the present invention on computer readable
medium. The sequence information can be represented in a word processing text
file, formatted in commercially available software such as WordPerfect and
Microsoft Word, represented in the form of an ASCII file, or stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily
adapt any number of data-processor structuring formats (e.g., text file or database)
in order to obtain computer readable medium having recorded thereon the
nucleotide sequence information of the present invention.
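
The two structuring formats named above (text file or database) can be
illustrated, without limitation, by the following sketch. It renders named
sequences as ASCII text in the widely used FASTA layout and stores the same
records in a relational table. SQLite, from the Python standard library, is
used here only as a stand-in for the database applications mentioned above;
the table name, column names, function names and placeholder residues are all
assumptions of the sketch.

```python
import sqlite3
from typing import Mapping


def to_fasta(records: Mapping[str, str], width: int = 60) -> str:
    """Render named sequences as plain ASCII text in FASTA layout."""
    lines = []
    for name, residues in records.items():
        lines.append(f">{name}")
        # Break each sequence into fixed-width lines.
        lines.extend(residues[i:i + width] for i in range(0, len(residues), width))
    return "\n".join(lines) + "\n"


def store_in_database(records: Mapping[str, str], conn: sqlite3.Connection) -> None:
    """Store the same records as rows of a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO sequences (name, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


if __name__ == "__main__":
    # Placeholder residues, not taken from the Sequence Listing.
    example = {"contig_example": "ACGTACGTAC" * 7}
    print(to_fasta(example), end="")
    connection = sqlite3.connect(":memory:")
    store_in_database(example, connection)
    print(connection.execute("SELECT name, length(residues) FROM sequences").fetchall())
```

Consistent with the paragraph above, the choice between the two forms would
generally turn on how the stored information is to be accessed: a text file
suits sequential reading by search programs, while a table suits retrieval of
individual records by name.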

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form the nucleotide sequences of SEQ ID NOS:
1-391, a representative fragment thereof, or a nucleotide sequence at least 95%,
preferably at least 99% and most preferably at least 99.9% identical to a sequence
of SEQ ID NOS: 1-391, the present invention enables the skilled artisan routinely to
access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements
the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE
(Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Streptococcus
pneumoniae genome which contain homology to ORFs or proteins from both
Streptococcus pneumoniae and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Streptococcus pneumoniae genome
useful in producing commercially important proteins, such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described herein. Such
systems are designed to identify, among other things, commercially important
fragments of the Streptococcus pneumoniae genome.

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing
unit (CPU), input means, output means, and data storage means. A skilled artisan
can readily appreciate that any one of the currently available computer-based
systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention
comprise a data storage means having stored therein a nucleotide sequence of the
present invention and the necessary hardware means and software means for
supporting and implementing a search means.

As used herein, "data storage means" refers to memory which can store
nucleotide sequence information of the present invention, or a memory access
means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage
means. Search means are used to identify fragments or regions of the present
genomic sequences which match a particular target sequence or target motif. A
variety of known algorithms are disclosed publicly, and a variety of commercially
available software packages for conducting search means are available and can be
used in the computer-based systems of the present invention. Examples of such
software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX
(NCBIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches
can be adapted for use in the present computer-based systems.
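
A deliberately minimal search means, in the sense defined above, might be
sketched as follows. It reports every exact occurrence of a target nucleotide
sequence, on either strand, within a set of stored sequences. It is an
exact-match scan only and is not a substitute for the homology search programs
named above; the function names, the mapping used as the data store, and the
placeholder sequences are assumptions of the sketch.

```python
from typing import List, Mapping, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_target(target: str, sequences: Mapping[str, str]) -> List[Tuple[str, str, int]]:
    """Return (name, strand, 0-based start) for each exact occurrence of target.

    A '-' hit means the reverse complement of the target was found at that
    position of the stored (forward-strand) sequence.
    """
    if not target:
        raise ValueError("target must not be empty")
    target = target.upper()
    rc = reverse_complement(target)
    # A palindromic target reads the same on both strands; report it once.
    patterns = [("+", target)] if rc == target else [("+", target), ("-", rc)]
    hits = []
    for name, seq in sequences.items():
        seq = seq.upper()
        for strand, pattern in patterns:
            pos = seq.find(pattern)
            while pos != -1:
                hits.append((name, strand, pos))
                pos = seq.find(pattern, pos + 1)  # allow overlapping hits
    return hits


if __name__ == "__main__":
    # Placeholder sequences, not taken from the Sequence Listing.
    stored = {"contig_a": "AAACCCGGGTTT", "contig_b": "TTTGGG"}
    print(find_target("AAACCC", stored))
```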

As used herein, a "target sequence" can be any DNA or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan
can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most
preferred sequence length of a target sequence is from about 10 to 100 amino acids
or from about 30 to 300 nucleotide residues. However, it is well recognized that
searches for commercially important fragments, such as sequence fragments
involved in gene expression and protein processing, may be of shorter length.
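
The relationship between target length and random occurrence stated above can
be made concrete with a simple estimate. Under the simplifying assumptions
that residues are independent and equally frequent, and that only one strand is
searched, a specific target of length n is expected to occur about
(N - n + 1) / A^n times by chance in a database of N residues, where A is the
alphabet size (4 for nucleotides, 20 for amino acids). The model, the function
name and the database size used below are assumptions for illustration only.

```python
def expected_random_hits(target_length: int, database_length: int, alphabet_size: int = 4) -> float:
    """Expected chance occurrences of one specific target in a random database.

    Assumes independent, equally frequent residues and a single searched strand.
    """
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0
    return positions * float(alphabet_size) ** -target_length


if __name__ == "__main__":
    # Hypothetical database of two million nucleotides.
    for length in (6, 12, 30):
        print(length, expected_random_hits(length, 2_000_000))
```

Under this model a six-nucleotide target is expected to recur by chance
hundreds of times in such a database, whereas a thirty-nucleotide target is
expected essentially never to do so.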

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s)
are chosen based on a three-dimensional configuration which is formed upon the
folding of the target motif. There are a variety of target motifs known in the art.
Protein target motifs include, but are not limited to, enzymic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding
sequences).

A variety of structural formats for the input and output means can be used
to input and output the information in the computer-based systems of the present
invention. A preferred format for an output means ranks fragments of the
Streptococcus pneumoniae genomic sequences possessing varying degrees of
homology to the target sequence or target motif. Such presentation provides a
skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in
the identified fragment.
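
A ranked output of the kind described above can be sketched as follows. This is
an illustration only and not the BLAST algorithm: it is a naive ungapped scan
that scores every window of the stored sequence against the target by percent
identity and lists the best windows first. The function name `rank_windows` and
the example sequences are hypothetical.

```python
from typing import List, Tuple


def rank_windows(genome: str, target: str, top: int = 3) -> List[Tuple[float, int, str]]:
    """Rank ungapped windows of `genome` by percent identity to `target`.

    Returns up to `top` tuples of (percent identity, 0-based start, window),
    best first; ties are broken by the earlier start position.
    """
    n = len(target)
    if n == 0 or n > len(genome):
        return []
    hits = []
    for start in range(len(genome) - n + 1):
        window = genome[start:start + n]
        # Count positions at which the window and the target agree.
        matches = sum(a == b for a, b in zip(window, target))
        hits.append((100.0 * matches / n, start, window))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]


if __name__ == "__main__":
    for identity, start, window in rank_windows("AAACGTACGTTT", "ACGT"):
        print(f"{identity:5.1f}%  start={start}  {window}")
```

A production search means would use a heuristic such as BLAST rather than an
exhaustive scan; the sketch is meant only to show the ranked form of the output.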

A variety of comparing means can be used to compare a target sequence or
target motif with the data storage means to identify sequence fragments of the



WO 98/18931 PCT/US97/19588 


Streptococcus pneumoniae genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in
Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading
frames within the Streptococcus pneumoniae genome. A skilled artisan can readily
recognize that any one of the publicly available homology search programs can be
used as the search means for the computer-based systems of the present invention.
Of course, suitable proprietary systems that may be known to those of skill also
may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102
includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM)
and a variety of secondary storage devices 110, such as a hard drive 112 and a
removable medium storage device 114. The removable medium storage device 114
may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape
drive, etc. A removable storage medium 116 (such as a floppy disk, a compact
disk, a magnetic tape, etc.) containing control logic and/or data recorded therein
may be inserted into the removable medium storage device 114. The computer
system 102 includes appropriate software for reading the control logic and/or the
data from the removable storage medium 116, once it is inserted into the
removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well
known manner in the main memory 108, any of the secondary storage devices 110,
and/or a removable storage medium 116. During execution, software for accessing
and processing the genomic sequence (such as search tools, comparing tools, etc.)
resides in main memory 108, in accordance with the requirements and operating
parameters of the operating system, the hardware system and the software program
or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to isolated
fragments of the Streptococcus pneumoniae genome. The fragments of the
Streptococcus pneumoniae genome of the present invention include, but are not
limited to, fragments which encode peptides and polypeptides, hereinafter open
reading frames (ORFs), fragments which modulate the expression of an operably
linked ORF, hereinafter expression modulating fragments (EMFs), and fragments
which can be used to diagnose the presence of Streptococcus pneumoniae in a
sample, hereinafter diagnostic fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment
of the Streptococcus pneumoniae genome" refers to a nucleic acid molecule
possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are
normally associated with the composition. Particularly, the term refers to the
nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-391, to
representative fragments thereof as described above, to polynucleotides at least
95%, preferably at least 99% and especially preferably at least 99.9% identical in
sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated
fragments of the present invention. These include, but are not limited to, methods
which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Streptococcus pneumoniae DNA can be enzymatically
sheared to produce fragments of 15-20 kb in length. These fragments can then be
used to generate a Streptococcus pneumoniae library by inserting them into lambda
clones as described in the Examples below. Primers flanking, for example, an
ORF, such as those enumerated in Tables 1-3, can then be generated using
nucleotide sequence information provided in SEQ ID NOS: 1-391. Well known
and routine techniques of PCR cloning then can be used to isolate the ORF from
the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given
the availability of SEQ ID NOS: 1-391, the information in Tables 1, 2 and 3, and
the information that may be obtained readily by analysis of the sequences of SEQ
ID NOS: 1-391 using methods set out above, those of skill will be enabled by the
present disclosure to isolate any ORF-containing or other nucleic acid fragment of
the present invention.

The isolated nucleic acid molecules of the present invention include, but are
not limited to, single stranded and double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets
coding for amino acids without any termination codons and is a sequence
translatable into protein.

Tables 1, 2, and 3 list ORFs in the Streptococcus pneumoniae genomic
contigs of the present invention that were identified as putative coding regions by
the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in
accordance with well known analytical methods, such as those discussed herein, to
generate more inclusive, more restrictive, or more selective lists.

Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that over a continuous region of at least 50 bases are 95% or
more identical (by BLAST analysis) to a nucleotide sequence available through
GenBank in October, 1997.

Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the 
present invention that are not in Table 1 and match, with a BLASTP probability 
score of 0.01 or less, a polypeptide sequence available through GenBank in 
October, 1997. 

Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the
present invention that do not match significantly, by BLASTP analysis, a
polypeptide sequence available through GenBank in October, 1997.

In each table, the first and second columns identify the ORF by,
respectively, contig number and ORF number within the contig; the third column
indicates the first nucleotide of the ORF (actually the first nucleotide of the stop
codon immediately preceding the ORF), counting from the 5' end of the contig
strand; and the fourth column, "stop (nt)," indicates the last nucleotide of the stop
codon defining the 3' end of the ORF.

In Tables 1 and 2, column five lists the Reference for the closest
matching sequence available through GenBank. These reference numbers are the
database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominations. Descriptions of the nomenclature are available
from the National Center for Biotechnology Information. Column six in Tables 1
and 2 provides the gene name of the matching sequence; column seven provides
the BLAST identity score and column eight the BLAST similarity score from the
comparison of the ORF and the homologous gene; and column nine indicates the
length in nucleotides of the highest scoring segment pair identified by the BLAST
identity analysis.

Each ORF described in the tables is defined by "start (nt)" (5') and "stop
(nt)" (3') nucleotide position numbers. These position numbers refer to the
boundaries of each ORF and provide orientation with respect to whether the
forward or reverse strand is the coding strand and in which reading frame the coding
sequence is contained. The "start" position is the first nucleotide of the triplet
encoding a stop codon just 5' to the ORF and the "stop" position is the last
nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon
at the 3' end of the ORF). Those of ordinary skill in the art appreciate that
preferred fragments within each ORF described in the table include fragments of
each ORF which include the entire sequence from the delineated "start" and "stop"
positions excepting the first and last three nucleotides since these encode stop
codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three
(3) 5' nucleotides and the three (3) 3' nucleotides are encompassed by the present
invention. Those of skill also appreciate that particularly preferred are fragments
within each ORF that are polynucleotide fragments comprising polypeptide coding
sequence. As defined herein, "coding sequence" includes the fragment within an
ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending
with the last nucleotide prior to the triplet encoding the 3' stop codon. Preferred
are fragments comprising the entire coding sequence and fragments comprising the
entire coding sequence, excepting the coding sequence for the N-terminal
methionine. Those of skill appreciate that the N-terminal methionine is often
removed during post-translational processing and that polynucleotides lacking the
ATG can be used to facilitate production of N-terminal fusion proteins which may
be beneficial in the production or use of genetically engineered proteins. Of course,
due to the degeneracy of the genetic code many polynucleotides can encode a given
polypeptide. Thus, the invention further includes polynucleotides comprising a
nucleotide sequence encoding a polypeptide sequence itself encoded by the coding
sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides
at least 95%, preferably at least 99% and especially preferably at least 99.9%
identical in sequence to the foregoing polynucleotides, are contemplated by the
present invention.
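
The stop-to-stop ORF boundaries and the "coding sequence" definition above can
be sketched in a few lines. This is a minimal illustration under stated
assumptions: the input is the coding-strand sequence running from the "start"
position to the "stop" position inclusive (so it begins and ends with a stop
codon), and the function name `coding_sequence` and the example sequence are
hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_sequence(orf: str) -> Optional[str]:
    """Return the coding sequence inside a stop-to-stop ORF.

    `orf` is the coding strand from the first nucleotide of the 5' stop
    codon to the last nucleotide of the 3' stop codon. The result runs
    from the first in-frame ATG to the last nucleotide before the 3' stop
    codon, or is None when the ORF contains no in-frame ATG.
    """
    orf = orf.upper()
    if len(orf) < 6 or len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three, at least 6")
    if orf[:3] not in STOP_CODONS or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be flanked by stop codons")
    # Drop the three 5' and three 3' nucleotides, which encode stop codons.
    interior = orf[3:-3]
    # Walk codon by codon so that only in-frame ATG triplets are considered.
    for i in range(0, len(interior), 3):
        if interior[i:i + 3] == "ATG":
            return interior[i:]
    return None


if __name__ == "__main__":
    cds = coding_sequence("TAAGGGATGAAACCCTGA")
    print(cds)       # coding sequence including the initiating ATG
    print(cds[3:])   # same fragment without the N-terminal methionine codon
```

Slicing off the first three nucleotides of the result gives the variant lacking
the codon for the N-terminal methionine that is also described above.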

Polypeptides encoded by polynucleotides described above and elsewhere
herein are also provided by the present invention, as are polypeptides comprising
an amino acid sequence at least about 95%, preferably at least 97% and even more
preferably 99% identical to the amino acid sequence of a polypeptide encoded by an
ORF shown in Tables 1-3. These polypeptides may or may not comprise an
N-terminal methionine.

The concepts of percent identity and percent similarity of two polypeptide
sequences are well understood in the art. For example, two polypeptides 10 amino
acids in length which differ at three amino acid positions (e.g., at positions 1, 3
and 5) are said to have a percent identity of 70%. However, the same two
polypeptides would be deemed to have a percent similarity of 80% if, for example
at position 5, the amino acid moieties, although not identical, were "similar" (i.e.,
possessed similar biochemical characteristics). Many programs for analysis of
nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically
list percent identity of a matching region as an output parameter. Thus, for
instance, Tables 1 and 2 herein enumerate the percent identity of the highest
scoring segment pair in each ORF and its listed relative. Further details
concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations
provided below.
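
The 70% identity and 80% similarity figures in the example above can be
reproduced with a short calculation. The sketch below assumes two ungapped
sequences of equal length; the peptide sequences and the single "similar" pair
(isoleucine and leucine) are hypothetical choices made for illustration and do
not represent the scoring matrices used by FASTA or BLAST.

```python
from typing import FrozenSet, Set


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions carrying identical residues."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def percent_similarity(a: str, b: str,
                       similar_pairs: Set[FrozenSet[str]]) -> float:
    """Percentage of positions that are identical or listed as similar."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    scored = sum(x == y or frozenset((x, y)) in similar_pairs
                 for x, y in zip(a, b))
    return 100.0 * scored / len(a)


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at positions 1, 3 and 5.
    first = "MKVLIGSTPE"
    second = "AKGLLGSTPE"
    # Assumed similarity rule: Ile and Leu count as "similar".
    similar = {frozenset(("I", "L"))}
    print(percent_identity(first, second))             # 70.0
    print(percent_similarity(first, second, similar))  # 80.0
```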

It will be appreciated that other criteria can be used to generate more
inclusive and more exclusive listings of the types set out in the tables. As those of
skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae
genome other than those listed in Tables 1-3, such as ORFs which are overlapping
or encoded by the opposite strand of an identified ORF in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a
series of nucleotide molecules which modulates the expression of an operably
linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an
operably linked sequence" when the expression of the sequence is altered by the
presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are
fragments which induce the expression of an operably linked ORF in response to a
specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Streptococcus
pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An
intergenic segment, or a fragment of the intergenic segment, from about 10 to 200
nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate
the expression of an operably linked ORF in a fashion similar to that found with the
naturally linked ORF sequence. As used herein, an "intergenic segment" refers to
fragments of the Streptococcus pneumoniae genome which are between two
ORF(s) herein described. EMFs also can be identified using known EMFs as a
target sequence or target motif in the computer-based systems of the present
invention. Further, the two methods can be combined and used together.
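
Locating candidate intergenic segments from the ORF coordinates can be sketched
as follows. This is an illustration under stated assumptions: coordinates are
1-based and inclusive on a single contig, reverse-strand ORFs are assumed to be
listed with a start greater than their stop, and gaps longer than the upper
bound are simply skipped even though the text also permits taking a fragment of
a longer intergenic segment. The function name and the example coordinates are
hypothetical.

```python
from typing import List, Tuple


def intergenic_segments(orfs: List[Tuple[int, int]],
                        min_len: int = 10,
                        max_len: int = 200) -> List[Tuple[int, int]]:
    """Return gaps between neighbouring ORFs as (first, last) positions.

    `orfs` holds (start, stop) pairs in 1-based inclusive contig
    coordinates, in any order and on either strand. Only gaps whose
    length lies within [min_len, max_len] are reported.
    """
    # Normalise each ORF to (left, right) and order them along the contig.
    spans = sorted((min(s, e), max(s, e)) for s, e in orfs)
    gaps = []
    covered_to = None  # right-most position covered by any ORF so far
    for left, right in spans:
        if covered_to is not None and left > covered_to + 1:
            first, last = covered_to + 1, left - 1
            if min_len <= last - first + 1 <= max_len:
                gaps.append((first, last))
        covered_to = right if covered_to is None else max(covered_to, right)
    return gaps


if __name__ == "__main__":
    # Hypothetical ORF coordinates; the third ORF is on the reverse strand.
    print(intergenic_segments([(1, 300), (351, 900), (2000, 1200)]))
```

Each reported segment is only a candidate; as described below, its activity
would still need to be confirmed experimentally, for example with an EMF trap
vector.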

The presence and activity of an EMF can be confirmed using an EMF trap
vector. An EMF trap vector contains a cloning site linked to a marker sequence. A
marker sequence encodes an identifiable phenotype, such as antibiotic resistance or
a complementing nutritional auxotrophic factor, which can be identified or assayed
when the EMF trap vector is placed within an appropriate host under appropriate
conditions. As described above, an EMF will modulate the expression of an
operably linked marker sequence. A more detailed discussion of various marker
sequences is provided below. A sequence which is suspected as being an EMF is
cloned in all three reading frames in one or more restriction sites upstream from the
marker sequence in the EMF trap vector. The vector is then transformed into an
appropriate host using known procedures and the phenotype of the transformed
host is examined under appropriate conditions. As described above, an EMF will
modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide
molecules which selectively hybridize to Streptococcus pneumoniae sequences.
DFs can be readily identified by identifying unique sequences within contigs of the
Streptococcus pneumoniae genome, such as by using well-known computer
analysis software, and by generating and testing probes or amplification primers
consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not
limited to the specific sequences herein described, but also include allelic and
species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS: 1-391, a
representative fragment thereof, or a nucleotide sequence at least 95%, preferably
at least 99% and most preferably at least 99.9% identical to SEQ ID NOS: 1-391,
with a sequence from another isolate of the same species. Furthermore, to
accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon
for another which encodes the same amino acid is expressly contemplated. Any
specific sequence disclosed herein can be readily screened for errors by
resequencing a particular fragment, such as an ORF, in both directions (i.e.,
sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Streptococcus pneumoniae origin
isolated by using part or all of the fragments in question as a probe or primer.

Preferred DFs of the present invention comprise at least about 17,
preferably at least about 20, and more preferably at least about 50 contiguous
nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs
specifically hybridize to a polynucleotide containing the sequence of the ORF from
which they are derived. Specific hybridization occurs even under stringent
conditions defined elsewhere herein.

Each of the ORFs of the Streptococcus pneumoniae genome disclosed in
Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as
polynucleotide reagents in numerous ways. For example, the sequences can be
used as diagnostic probes or diagnostic amplification primers to detect the presence
of a specific microbe in a sample, particularly Streptococcus pneumoniae.
Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most
likely to be highly selective for Streptococcus pneumoniae. Also particularly
preferred are ORFs that can be used to distinguish between strains of Streptococcus
pneumoniae, particularly those that distinguish medically important strains, such as
drug-resistant strains.

In addition, the fragments of the present invention, as broadly described,
can be used to control gene expression through triple helix formation or antisense
DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are
usually 20 to 40 bases in length and are designed to be complementary to a region
of the gene involved in transcription, for triple-helix formation, or to the mRNA
itself, for antisense inhibition. Both techniques have been demonstrated to be
effective in model systems, and the requisite techniques are well known and
involve routine procedures. Triple helix techniques are discussed in, for example,
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in
general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988).
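
The design of an antisense oligonucleotide of the stated length can be sketched
as a reverse-complement operation. This is a minimal illustration only: it
assumes the target mRNA is supplied as its DNA-letter (sense strand)
equivalent, it takes a region at a caller-chosen offset, and it ignores
practical design criteria such as secondary structure and melting temperature.
The function names and the example sequence are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to a region of `sense`.

    `sense` is the sense-strand sequence in DNA letters, `start` is a
    0-based offset, and `length` must fall in the 20 to 40 base range
    described in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = sense[start:start + length]
    if start < 0 or len(region) != length:
        raise ValueError("requested region extends beyond the sequence")
    return reverse_complement(region)


if __name__ == "__main__":
    # Hypothetical sense-strand fragment beginning at an initiation codon.
    fragment = "ATGAAACCCGGGTTTACGTACGTAAA"
    print(antisense_oligo(fragment, 0, 20))
```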

The present invention further provides recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention. Certain preferred recombinant constructs of the
present invention comprise a vector, such as a plasmid or viral vector, into which a
fragment of the Streptococcus pneumoniae genome has been inserted, in a forward
or reverse orientation. In the case of a vector comprising one of the ORFs of the
present invention, the vector may further comprise regulatory sequences, including
for example, a promoter, operably linked to the ORF. For vectors comprising the
EMFs of the present invention, the vector may further comprise a marker sequence
or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of
skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK,
pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene);
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia).
Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG
(available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from
Pharmacia).

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers.
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the
appropriate vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the
isolated fragments of the Streptococcus pneumoniae genomic fragments and
contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host
cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or
a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct
comprising an ORF of the present invention, may be introduced into the host by a
variety of well established techniques that are standard in the art, such as calcium
phosphate transfection, DEAE-dextran mediated transfection and electroporation,
which are described in, for instance, Davis, L. et al., BASIC METHODS IN
MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Streptococcus
pneumoniae genomic fragments and contigs of the present invention can be used
in conventional manners to produce the gene product encoded by the isolated
fragment (in the case of an ORF) or can be used to produce a heterologous protein
under the control of the EMF.

The present invention further provides
isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present
invention. By "degenerate variant" is intended nucleotide fragments which differ
from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide
sequence but, due to the degeneracy of the Genetic Code, encode an identical
polypeptide sequence.
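
The notion of a degenerate variant can be checked mechanically: two coding
sequences are degenerate variants of one another when they differ in nucleotide
sequence yet translate to the same polypeptide. The sketch below assumes the
standard genetic code and in-frame coding sequences; the function names and the
example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest); '*' marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter residues."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_degenerate_variant(a: str, b: str) -> bool:
    """True when `a` and `b` differ as DNA but encode the same polypeptide."""
    return a.upper() != b.upper() and translate(a) == translate(b)


if __name__ == "__main__":
    original = "ATGAAACCC"  # hypothetical fragment encoding Met-Lys-Pro
    variant = "ATGAAGCCT"   # synonymous codons for Lys and Pro
    print(translate(original), translate(variant))
    print(is_degenerate_variant(original, variant))
```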

Preferred nucleic acid fragments of the present invention are the ORFs and
subfragments thereof depicted in Tables 2 and 3 which encode proteins.

A variety of methodologies known in the art can be utilized to obtain any
one of the isolated polypeptides or proteins of the present invention. At the
simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small
peptides and fragments of larger polypeptides. Such short fragments as may be
obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from
bacterial cells which naturally produce the polypeptide or protein. One skilled in
the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention
produced naturally by a bacterial strain, or by other methods. Methods for
isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified
from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell, through genetic manipulation, is made to produce a
polypeptide or protein which it normally does not produce or which the cell
normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell which
produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of
the present invention. These include, but are not limited to, eukaryotic hosts such
as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and B. subtilis. The most preferred cells are those which do not
normally express the particular polypeptide or protein or which express the
polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is
derived from recombinant (e.g., microbial or mammalian) expression systems.
"Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous
substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides.
Generally, DNA segments encoding the polypeptides and proteins provided by this
invention are assembled from fragments of the Streptococcus pneumoniae genome
and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a
synthetic gene which is capable of being expressed in a recombinant transcriptional
unit comprising regulatory elements derived from a microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The
expression vehicle can comprise a transcriptional unit comprising an assembly of
(1) genetic regulatory elements necessary for gene expression in the host,
including elements required to initiate and maintain transcription at a level sufficient
for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a
structural or coding sequence which is transcribed into mRNA and translated into
protein, and (3) appropriate signals to initiate translation at the beginning of the
desired coding region and terminate translation at its end. Structural units intended
for use in yeast or eukaryotic expression systems preferably include a leader
sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport
sequence, it may include an N-terminal methionine residue. This residue may or
may not be subsequently cleaved from the expressed recombinant protein to
provide a final product.

"Recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express
heterologous polypeptides or proteins upon induction of the regulatory elements
linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation
systems can also be employed to produce such proteins using RNAs derived from
the DNA constructs of the present invention. Appropriate cloning and expression
vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which
is hereby incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell, e.g.,
the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a
downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous
structural sequence is assembled in appropriate phase with translation initiation and
termination sequences, and preferably, a leader sequence capable of directing
secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
N-terminal identification peptide imparting desired characteristics, e.g., stabilization
or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable
translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable
markers and an origin of replication to ensure maintenance of the vector and, when
desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of E. coli, B.
subtilis, Salmonella typhimurium and various species within the genera
Pseudomonas and Streptomyces. Others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising genetic elements of the
well known cloning vector pBR322 (ATCC 37017). Such commercial vectors
include, for example, pKK223-3 (available from Pharmacia Fine Chemicals,
Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host
strain to an appropriate cell density, the selected promoter, where it is inducible, is
derepressed or induced by appropriate means (e.g., temperature shift or chemical
induction) and cells are cultured for an additional period to provide for expression
of the induced gene product. Thereafter cells are typically harvested, generally by
centrifugation, disrupted to release expressed protein, generally by physical or
chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example,
the C127, 3T3, CHO, HeLa and BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer,
splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are
usually isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any
convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high
performance liquid chromatography (HPLC) can be employed for final purification
steps.

The present invention further includes isolated polypeptides, proteins and
nucleic acid molecules which are substantially equivalent to those herein described.
As used herein, substantially equivalent can refer both to nucleic acid and amino
acid sequences, for example a mutant sequence, that varies from a reference
sequence by one or more substitutions, deletions, or additions, the net effect of
which does not result in an adverse functional dissimilarity between reference and
subject sequences. For purposes of the present invention, sequences having
equivalent biological activity and equivalent expression characteristics are
considered substantially equivalent. For purposes of determining equivalence,
truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other
strains of Streptococcus pneumoniae, of the fragments of the Streptococcus
pneumoniae genome of the present invention and homologs of the proteins encoded
by the ORFs of the present invention. As used herein, a sequence or protein of
Streptococcus pneumoniae is defined as a homolog of a fragment of the
Streptococcus pneumoniae fragments or contigs or a protein encoded by one of the
ORFs of the present invention, if it shares significant homology to one of the
fragments of the Streptococcus pneumoniae genome of the present invention or a
protein encoded by one of the ORFs of the present invention. Specifically, by
using the sequence disclosed herein as a probe or as primers, and techniques such
as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain
homologs.

As used herein, two nucleic acid molecules or proteins are said to "share
significant homology" if the two contain regions which possess greater than 85%
sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those
with 93% or more homology. Among especially preferred homologs those with
95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those
are homologs with 99% or more homology. The most preferred homologs among
these are those with 99.9% homology or more. It will be understood that, among
measures of homology, identity is particularly preferred in this regard.
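
As an illustration only, the homology thresholds recited above can be applied to a pair of sequences with a short Python sketch. It assumes the two sequences have already been aligned to equal length by some external tool (with "-" marking gaps) and uses identity over the full alignment length as the measure; other denominators are possible, and the function names are hypothetical.

```python
# Hypothetical sketch: percent identity over a pre-computed pairwise alignment,
# followed by a lookup of the highest threshold from the text that is satisfied.
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


def highest_threshold_met(pct: float):
    """Return the highest recited threshold satisfied by pct, or None."""
    # "93% or more", "95% or more", "97%", "99% or more", "99.9% ... or more"
    for threshold in (99.9, 99.0, 97.0, 95.0, 93.0):
        if pct >= threshold:
            return threshold
    if pct > 90.0:   # "more than 90%"
        return 90.0
    if pct > 85.0:   # "greater than 85%" (significant homology)
        return 85.0
    return None


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, highest_threshold_met(pct))
```

The strict and non-strict comparisons follow the wording of the paragraph above ("greater than", "more than", "or more"); treating the bare "97%" as "97% or more" is an assumption of the sketch.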

Region specific primers or probes derived from the nucleotide sequence
provided in SEQ ID NOS:1-391 or from a nucleotide sequence at least 95%,
particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-391 can be used to prime DNA synthesis and PCR amplification, as
well as to identify colonies containing cloned DNA encoding a homolog. Methods
suitable to this aspect of the present invention are well known and have been
described in great detail in many publications such as, for example, Innis et al.,
PCR Protocols, Academic Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-391 or from a nucleotide
sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391,
one skilled in the art will recognize that by employing high stringency conditions
(e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at
50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to
the primer will be amplified. By employing lower stringency conditions (e.g.,
hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C
in 0.5X SSPC), sequences which are greater than 40-50% homologous to the
primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-391, or from a
nucleotide sequence having an aforementioned identity to a sequence of SEQ ID
NOS:1-391, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X
SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences
having regions which are greater than 90% homologous to the probe can be
obtained, and that by employing lower stringency conditions (e.g., hybridizing at
35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X
SSPC), sequences having regions which are greater than 35-45% homologous to
the probe will be obtained.
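
The example conditions given above for primer-based amplification and for colony/plaque hybridization can be collected into a small lookup structure. The following Python sketch is illustrative only: the class and function names are hypothetical, and where the text gives a range for the homology cutoff (40-50%, 35-45%) the sketch conservatively assumes the upper end of the range.

```python
# Hypothetical lookup of the example stringency conditions recited in the text.
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    use: str                 # "pcr" or "hybridization"
    stringency: str          # "high" or "low"
    hybridize: str           # annealing / hybridization step as recited
    wash: str                # wash step as recited
    min_homology_pct: float  # upper end of the recited range (assumption)


CONDITIONS = [
    Condition("pcr", "high", "50-60 C, 6X SSPC, 50% formamide", "50-65 C, 0.5X SSPC", 75.0),
    Condition("pcr", "low", "35-37 C, 5X SSPC, 40-45% formamide", "42 C, 0.5X SSPC", 50.0),
    Condition("hybridization", "high", "50-65 C, 5X SSPC, 50% formamide", "50-65 C, 0.5X SSPC", 90.0),
    Condition("hybridization", "low", "35-37 C, 5X SSPC, 40-45% formamide", "42 C, 0.5X SSPC", 45.0),
]


def usable_stringencies(use: str, homology_pct: float) -> list:
    """Stringency levels expected, per the text, to recover a target of this homology."""
    return [c.stringency for c in CONDITIONS
            if c.use == use and homology_pct > c.min_homology_pct]


if __name__ == "__main__":
    print(usable_stringencies("hybridization", 95.0))
    print(usable_stringencies("pcr", 60.0))
```

The cutoffs are the approximate figures stated in the specification, not measured values; actual recovery depends on probe length, base composition and the other experimental variables.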

Any organism can be used as the source for homologs of the present
invention so long as the organism naturally expresses such a protein or contains
genes encoding the same. The most preferred organisms for isolating homologs are
bacteria which are closely related to Streptococcus pneumoniae.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION

Each ORF provided in Tables 1 and 2 is identified with a function by
homology to a known gene or polypeptide. As a result, one skilled in the art can
use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the
polypeptide. Such identifications permit one skilled in the art to use the
Streptococcus pneumoniae ORFs in a manner similar to the known type of
sequences for which the identification is made; for example, to ferment a particular
sugar source or to produce a particular metabolite. A variety of reviews illustrative
of this aspect of the invention are available, including the following reviews on the
industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND
BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY
(1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds.,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of
exemplary uses that illustrate this and similar aspects of the present invention are
discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic
reactions involved in intermediary and macromolecular metabolism, the
biosynthesis of small molecules, cellular processes and other functions, including
enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in
respiration, both aerobic and anaerobic, enzymes involved in fermentation,
enzymes involved in ATP proton motive force conversion, enzymes involved in
broad regulatory function, enzymes involved in amino acid synthesis, enzymes
involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin
synthesis, can be used for industrial biosynthesis.

The various metabolic pathways present in Streptococcus pneumoniae can
be identified based on absolute nutritional requirements as well as by examining the
various enzymes identified in Tables 1-3 and SEQ ID NOS:1-391.

Of particular interest are polypeptides involved in the degradation of
intermediary metabolites as well as non-macromolecular metabolism. Such
enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes.
Proteolytic enzymes find use in a number of industrial processes including the
processing of flax and other vegetable fibers, in the extraction, clarification and
depectinization of fruit juices, in the extraction of vegetable oil and in the
maceration of fruits and vegetables to give unicellular fruits. A detailed review of
the proteolytic enzymes used in the food industry is provided in Rombouts et al.,
Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural
Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium
Series 589:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism
of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars,
such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from
a commercial viewpoint, include sugar isomerases such as glucose isomerase.
Other metabolic enzymes have found commercial use, such as glucose oxidases,
which produce ketogulonic acid (KGA). KGA is an intermediate in the
commercial production of ascorbic acid using Reichstein's procedure, as
described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press,
Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in
purified form as well as in an immobilized form for the deoxygenation of beer.
See, for instance, Hartmeir et al., Biotechnology Letters 7:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid.
There is a market for gluconic acids, which are used in the detergent, textile, leather,
photographic, pharmaceutical, food, feed and concrete industries, as described, for
example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS
AND FUNGI, Benett et al., Eds., Academic Press, New York (1985). In addition
to industrial applications, GOD has found applications in medicine for quantitative
determination of glucose in body fluids and, recently, in biotechnology for analyzing
syrups from starch and cellulose hydrolysates. This application is described in
Owusu et al., Biochem. et Biophysica Acta 872:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from
sugar beets and sugar cane. In the field of industrial enzymes, the glucose
isomerase process shows the largest expansion in the market today. Initially,
soluble enzymes were used and later immobilized enzymes were developed
(Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer
Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business
using immobilized enzymes. A review of the industrial use of these enzymes is
provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent
additives and thus represent one of the largest volumes of microbial enzymes used
in the industrial sector. Because of their industrial importance, there is a large body
of published and unpublished information regarding the use of these enzymes in
industrial processes. (See Faultman et al., Acid Proteases Structure Function and
Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al.,
Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al.,
Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are
the microbial lipases, described by, for instance, Macrae et al., Philosophical
Transactions of the Royal Society of London 310:227 (1985) and Poserke, Journal
of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in
the fat and oil industry for the production of neutral glycerides using lipase
catalyzed inter-esterification of readily available triglycerides. Applications of
lipases include the use as a detergent additive to facilitate the removal of fats from
fabrics in the course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for
key steps in the synthesis of complex organic molecules is gaining popularity at a
great rate. One area of great interest is the preparation of chiral intermediates.
Preparation of chiral intermediates is of interest to a wide range of synthetic
chemists, particularly those scientists involved with the preparation of new
pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press,
Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of
interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters,
amides and nitriles, esterification reactions, trans-esterification reactions, synthesis
of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to
carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming
reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the
present invention for biotransformation and organic synthesis it is sometimes
necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole
cell system on the one hand or an isolated partially purified enzyme on the other
hand have been described in detail by Bud et al., Chemistry in Britain (1987), p.
127.

Amino transferases, enzymes involved in the biosynthesis and metabolism
of amino acids, are useful in the catalytic production of amino acids. The
advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and
generally possess uniformly high catalytic rates. A description of the use of amino
transferases for amino acid production is provided by Roselle-David, Methods of
Enzymology 136:419 (1987).

Another category of useful proteins encoded by the ORFs of the present
invention includes enzymes involved in nucleic acid synthesis, repair, and
recombination.

2. Generation of Antibodies 

As described here, the proteins of the present invention, as well as
homologs thereof, can be used in a variety of procedures and methods known in
the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the
protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well
as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of 
the proteins of the present invention and hybridomas which produce these 
antibodies. A hybridoma is an immortalized cell line which is capable of secreting 
a specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies
as well as hybridomas capable of producing the desired antibody are well known in
the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory
Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21
(1980); Kohler and Milstein, Nature 256:495-497 (1975)), as are the trioma
technique and the human B-cell hybridoma technique (Kozbor et al., Immunology
Today 4:72 (1983); pgs. 77-96 of Cole et al., in Monoclonal Antibodies And
Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse, rabbit,
etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods
include subcutaneous or intraperitoneal injection of the polypeptide. One skilled in
the art will recognize that the amount of the protein encoded by the ORF of the
present invention used for immunization will vary based on the animal which is
immunized, the antigenicity of the peptide and the site of injection.
The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity. Methods
of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin
or galactosidase) or through the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and
allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to
identify the hybridoma cell which produces an antibody with the desired
characteristics. These include screening the hybridomas with an ELISA assay,
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124
(1988)).

Hybridomas secreting the desired antibodies are cloned and the class and
subclass are determined using procedures known in the art (Campbell, A. M.,
Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1984)).

Techniques described for the production of single chain antibodies (U. S.
Patent 4,946,778) can be adapted to produce single chain antibodies to proteins of
the present invention.

For polyclonal antibodies, antibody-containing antisera are isolated from the
immunized animal and are screened for the presence of antibodies with the desired
specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in
detectably labelled form. Antibodies can be detectably labelled through the use of
radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such
as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as
FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing
such labeling are well-known in the art (see, for example, Sternberger et al., J.
Histochem. Cytochem. 75:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308
(1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol.
Meth. 75:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in
vivo, and in situ assays to identify cells or tissues in which a fragment of the
Streptococcus pneumoniae genome is expressed.

The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics
such as polycarbonate, complex carbohydrates such as agarose and sepharose,
and acrylic resins such as polyacrylamide and latex beads. Techniques for
coupling antibodies to such solid supports are well known in the art (Weir, D. M.
et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth.
Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the
present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits 

The present invention further provides methods to identify the expression
of one of the ORFs of the present invention, or homolog thereof, in a test sample,
using one of the DFs or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more 
of the antibodies or one or more of the DFs of the present invention and assaying 
for binding of the DFs or antibodies to components within the test sample. 

Conditions for incubating a DF or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the DF or antibody used in the
assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted
to employ the DFs or antibodies of the present invention. Examples of such assays
can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986);
Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press,
Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985).

The test samples of the present invention include cells, protein or membrane
extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the
assay format, nature of the detection method and the tissues, cells or extracts used
as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to
obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers, which comprises: (a) a first container
comprising one of the DFs or antibodies of the present invention; and (b) one or
more other containers comprising one or more of the following: wash reagents,
reagents capable of detecting presence of a bound DF or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are
contained in separate containers. Such containers include small glass containers,
plastic containers or strips of plastic or paper. Such containers allow one to
efficiently transfer reagents from one compartment to another compartment such
that the samples and reagents are not cross-contaminated, and the agents or
solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept
the test sample, a container which contains the antibodies used in the assay,
containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or DF.

Types of detection reagents include labelled nucleic acid probes, labelled
secondary antibodies, or in the alternative, if the primary antibody is labelled, the
enzymatic, or antibody binding reagents which are capable of reacting with the
labelled antibody. One skilled in the art will readily recognize that the disclosed
DFs and antibodies of the present invention can be readily incorporated into one of
the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents which bind to a
protein encoded by one of the ORFs of the present invention or to one of the
Streptococcus pneumoniae fragments and contigs herein
described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the
ORFs of the present invention, or an isolated fragment of the Streptococcus
pneumoniae genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, 
peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The 
agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates,
pharmaceutical agents and the like are selected at random and are assayed for their
ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used
herein, an agent is said to be "rationally selected or designed" when the agent is
chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate
peptides, pharmaceutical agents and the like capable of binding to a specific peptide
sequence in order to generate rationally designed antipeptide peptides, for example
see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307,
and Kaspczak et al., Biochemistry 25:9230-8 (1989), or pharmaceutical agents, or
the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one
of the ORFs or EMFs of the present invention. As described above, such agents
can be randomly screened or rationally designed/selected. Targeting the ORF or
EMF allows a skilled artisan to design sequence specific or element specific agents,
modulating the expression of either a single ORF or multiple ORFs which rely on
the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues
which hybridize or form a triple helix by binding to DNA or RNA. Such agents
can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a
variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and
are designed to be complementary to a region of the gene involved in transcription
(triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 257:1360 (1991)) or to the
mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides, and other DNA binding agents.
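
By way of illustration, an antisense oligonucleotide complementary to a chosen stretch of an mRNA can be derived from the corresponding coding-strand DNA sequence as its reverse complement. The Python sketch below is a simplified example under stated assumptions: the function names and the toy sequence are hypothetical, the 20 to 40 base window follows the text above, and no account is taken of secondary structure, melting temperature or backbone chemistry.

```python
# Hypothetical sketch: antisense oligo as the reverse complement of a target window.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """DNA oligo (5' to 3') complementary to coding_strand[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("the text describes agents of 20 to 40 bases")
    target = coding_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    toy_orf = "ATGGCTAGCAAAGGTGAAGAACTGTTT"  # toy sequence, not from SEQ ID NOS:1-391
    print(antisense_oligo(toy_orf, 0, 20))
```

Because the mRNA has the same sequence as the coding strand (with U in place of T), the reverse complement of the coding-strand window pairs with the mRNA over that window.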

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be
used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or
another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known
techniques to provide a pharmaceutical composition. As used herein, the
"pharmaceutical agents of the present invention" refers to the pharmaceutical agents
which are derived from the proteins encoded by the ORFs of the present invention
or are agents which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or
pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in
vitro" when the agent reduces the rate of growth, rate of division, or viability of
the organism in question. The pharmaceutical agents of the present invention can
modulate the growth or pathogenicity of an organism in many fashions, although
an understanding of the underlying mechanism of action is not needed to practice
the use of the pharmaceutical agents of the present invention. Some agents will
modulate the growth by binding to an important protein, thus blocking the biological
activity of the protein, while other agents may bind to a component of the outer
surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a
protein encoded by one of the ORFs of the present invention and serve as a
vaccine. The development and use of a vaccine based on outer membrane
components are well known in the art.

As used herein, a "related organism" is a broad term which refers to any
organism whose growth can be modulated by one of the pharmaceutical agents of
the present invention. In general, such an organism will contain a homolog of the
protein which is the target of the pharmaceutical agent or the protein used as a
vaccine. As such, related organisms do not need to be bacterial but may be fungal
or viral pathogens.

The pharmaceutical agents and compositions of the present invention may
be administered in a convenient manner, such as by the oral, topical, intravenous,
intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The
pharmaceutical compositions are administered in an amount which is effective for
treating and/or prophylaxis of the specific indication. In general, they are
administered in an amount of at least about 1 mg/kg body weight and in most cases
they will be administered in an amount not in excess of about 1 g/kg body weight
per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body
weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be
modified to form a chemical derivative. As used herein, a molecule is said to be a
"chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the
molecule's solubility, absorption, biological half-life, etc. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, etc. Moieties capable of mediating such
effects are disclosed in, among other sources, REMINGTON'S
PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the
functional derivative, such as affinity for a given antibody. Such changes in
immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox
or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic
degradation or the tendency to aggregate with carriers or into multimers also may
be effected in this way and can be assayed by methods well known to the skilled
artisan.

The therapeutic effects of the agents of the present invention may be
obtained by providing the agent to a patient by any suitable means (e.g., inhalation,
intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an
effective concentration within the blood or tissue in which the growth of the
organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be
by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the
dosage of the administered agent will vary depending upon such factors as the
patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of
agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of
patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents
of the present invention or another agent.

As used herein, two or more compounds or agents are said to be
administered "in combination" with each other when either (1) the physiological
effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be
administered concurrently with, prior to, or following the administration of the
other agent.

The agents of the present invention are intended to be provided to recipient
subjects in an amount sufficient to decrease the rate of growth (as defined above) of
the target organism.

The administration of the agent(s) of the invention may be for either a
"prophylactic" or "therapeutic" purpose. When provided prophylactically, the
agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent,
attenuate, or decrease the rate of onset of any subsequent infection. When
provided therapeutically, the agent(s) are provided at (or shortly after) the onset of
an indication of infection. The therapeutic administration of the compound(s)
serves to attenuate the pathological symptoms of the infection and to increase the
rate of recovery.

The agents of the present invention are administered to a subject, such as a
mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically
effective concentration. A composition is said to be "pharmacologically acceptable"
if its administration can be tolerated by a recipient patient. Such an agent is said to
be administered in a "therapeutically effective amount" if the amount administered
is physiologically significant. An agent is physiologically significant if its presence
results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known
methods to prepare pharmaceutically useful compositions, whereby these materials,
or their functional derivatives, are combined in a mixture with a pharmaceutically
acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of
other human proteins, e.g., human serum albumin, are described, for example, in
REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed.,
Mack Publishing, Easton PA (1980). In order to form a pharmaceutically
acceptable composition suitable for effective administration, such compositions will
contain an effective amount of one or more of the agents of the present invention,
together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration
of action. Controlled release preparations may be achieved through the use of
polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques,
including formulation with macromolecules such as, for example, polyesters,
polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate, by adjusting the concentration of the
macromolecules and the agent in the formulation, and by appropriate use of
methods of incorporation, which can be manipulated to effectuate a desired time
course of release. Another possible method to control the duration of action by
controlled release preparations is to incorporate agents of the present invention into
particles of a polymeric material such as polyesters, polyamino acids, hydrogels,
poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is possible to entrap these
materials in microcapsules prepared, for example, by coacervation techniques or by
interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes, albumin microspheres,
microemulsions, nanoparticles, and nanocapsules, or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES
(1980).

The invention further provides a pharmaceutical pack or kit comprising one
or more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in
the form prescribed by a governmental agency regulating the manufacture, use or
sale of pharmaceuticals or biological products, which notice reflects approval by
the agency of manufacture, use or sale for human administration.

In addition, the agents of the present invention may be employed in 
conjunction with other therapeutic compounds. 

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be
sequenced using a random shotgun approach. This procedure, described in detail
in the examples that follow, has eliminated the up-front cost of isolating and
ordering overlapping or contiguous subclones prior to the start of the sequencing
protocols.

Certain aspects of the present invention are described in greater detail in the
examples that follow. The examples are provided by way of illustration. Other
aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present
disclosure.

ILLUSTRATIVE EXAMPLES 

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing
follows from the Lander and Waterman (Lander and Waterman, Genomics
2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size
$L$, in nucleotides, is not sequenced after a certain amount, $n$, in nucleotides, of
random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where $m$ is $n/L$, the fold coverage. For instance, for a genome of
2.8 Mb, $m = 1$ when 2.8 Mb of sequence has been randomly generated (1X
coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base
has not been sequenced is the same as the probability that any region of the whole
sequence $L$ has not been determined and, therefore, is equivalent to the fraction of
the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size $L$, in nucleotides, has not been
sequenced. When 14 Mb of sequence has been generated, coverage is 5X for a
2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X
coverage of a 2.8 Mb sequence can be attained by sequencing approximately
17,000 random clones from both insert ends with an average sequence read length
of 410 bp.

Similarly, the total gap length, $G$, is determined by the equation $G = Le^{-m}$,
and the average gap size, $g$, follows the equation $g = L/N$, where $N$ is the
number of random sequence reads. Thus, 5X coverage leaves about 240 gaps
averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.

The treatment above is essentially that of Lander and Waterman, Genomics 
2: 231 (1988). 
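
The figures quoted above can be checked numerically. The following is a minimal
sketch, not part of the original analysis: the function name and dictionary keys
are illustrative, the average gap size is taken as the genome length divided by
the number of reads (which is what reproduces the 82 bp figure), and the gap
count is inferred as total gap length divided by average gap size, which is an
assumption of this sketch.

```python
import math


def shotgun_coverage_stats(genome_len_bp, n_reads, read_len_bp):
    """Poisson (Lander-Waterman style) estimates for a random shotgun project."""
    total_bases = n_reads * read_len_bp                  # n: nucleotides of random sequence
    fold_coverage = total_bases / genome_len_bp          # m = n / L
    unsequenced_fraction = math.exp(-fold_coverage)      # P0 = e^(-m)
    total_gap_bp = genome_len_bp * unsequenced_fraction  # G = L * e^(-m)
    avg_gap_bp = genome_len_bp / n_reads                 # g = L / N (N = number of reads)
    # Gap count inferred as G / g; this step is an assumption of the sketch.
    expected_gaps = total_gap_bp / avg_gap_bp
    return {
        "fold_coverage": fold_coverage,
        "unsequenced_fraction": unsequenced_fraction,
        "total_gap_bp": total_gap_bp,
        "avg_gap_bp": avg_gap_bp,
        "expected_gaps": expected_gaps,
    }


# Figures from the text: 2.8 Mb genome, about 17,000 clones read from both ends,
# 410 bp average read length.
stats = shotgun_coverage_stats(2_800_000, 2 * 17_000, 410)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

With these inputs the sketch gives roughly five-fold coverage, an unsequenced
fraction a little under 0.7%, and an average gap of about 82 bp, in line with
the values stated above.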

2. Random Library Construction

In order to approximate the random model described above during actual 
sequencing, a nearly ideal library of cloned genomic fragments is required. The 
following library construction procedure was developed to achieve this end. 

Streptococcus pneumoniae DNA is prepared by phenol extraction. A
mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM
Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical
Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The
sheared DNA is ethanol precipitated and redissolved in 500 µl TE buffer.

To create blunt-ends, a 100 µl aliquot of the resuspended DNA is digested
with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200
µl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated,
redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis
through a 1.0% low melting temperature agarose gel. The section containing DNA
fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted
and the resulting solution is extracted with phenol to separate the agarose from the
DNA. DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for
ligation to vector.

A two-step ligation procedure is used to produce a plasmid library with
97% inserts, of which >99% were single inserts. The first ligation mixture (50 µl)
contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase
(GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is
phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in
20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete
bands in a ladder are visualized by ethidium bromide-staining and UV illumination
and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of
the gel containing v+i DNA is excised and the v+i DNA is recovered and
resuspended into 20 µl TE. The v+i DNA then is blunt-ended by T4 polymerase
treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+i linears,
500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England
BioLabs), under recommended buffer conditions. After phenol extraction and
ethanol precipitation the repaired v+i linears are dissolved in 20 µl TE. The final
ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the
following day, the reaction mixture is stored at -20°C.

This two-stage procedure results in a molecularly random collection of
single-insert plasmid recombinants with minimal contamination from double-insert
chimeras (<1%) or free vector (<3%).

Since deviation from randomness can arise from propagation of the DNA in
the host, E. coli host cells deficient in all recombination and restriction functions
(A. Greener, Strategies 3 (1):5 (1990)) are used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells are
plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase
which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE
II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a
chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol
is added to the aliquot of cells to a final concentration of 25 mM. Cells are
incubated on ice for 10 min. A 1 µl aliquot of the final ligation is added to the cells
and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and
placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated
from this protocol in order to minimize the preferential growth of any given
transformed cell. Instead, the transformation mixture is plated directly on a nutrient
rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g
tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5
ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml
SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal
(2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 / 100 ml SOB agar. The 15 ml top layer
is poured just prior to plating. Our titer is approximately 100 colonies/10 µl aliquot
of transformation.

All colonies are picked for template preparation regardless of size. Thus, 
only clones lost due to "poison" DNA or deleterious gene products are deleted from 
the library, resulting in a slight increase in gap number over that expected. 

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a
"boiling bead" method developed in collaboration with Advanced Genetic
Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991);
Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a
96-well format for all stages of DNA preparation from bacterial growth through final
DNA purification. Template concentration is determined using Hoechst Dye and a
Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding
templates are identified where possible and not sequenced.

Templates are also prepared from two Streptococcus pneumoniae lambda
genomic libraries. An amplified library is constructed in the vector Lambda GEM-12
(Promega) and an unamplified library is constructed in Lambda DASH II
(Stratagene). In particular, for the unamplified lambda library, Streptococcus
pneumoniae DNA (> 100 kb) is partially digested in a reaction mixture (200 µl)
containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C.
The digested DNA is phenol-extracted and electrophoresed on a 0.5% low
melting agarose gel at 2 V/cm for 7 hours. Fragments from 15 to 25 kb are excised
and recovered in a final volume of 6 µl. One µl of fragments is used with 1 µl of
DASHII vector (Stratagene) in the recommended ligation reaction. One µl of the
ligation mixture is used per packaging reaction following the recommended
protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage
are plated directly without amplification from the packaging mixture (after dilution
with 500 µl of recommended SM buffer and chloroform treatment). Yield is about
2.5 x 10^ pfu/µl. The amplified library is prepared essentially as above except the
lambda GEM-12 vector is used. After packaging, about 3.5 x 10^4 pfu are plated on
the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and
stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1 x 10^9
pfu/ml.

Liquid lysates (100 µl) are prepared from randomly selected plaques (from
the unamplified library) and template is prepared by long-range PCR using T7 and
T3 vector-specific primers.

Sequencing reactions are carried out on plasmid and/or PCR templates
using the AB Catalyst LabStation with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and
the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye
terminator sequencing reactions are carried out on the lambda templates on a
Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction
Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence
the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library.

Sequencing reactions are performed by eight individuals using an average of
fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed
using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read
distance. The overall sequencing success rate is approximately 85% for
M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The
average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1
sequences, and 375 bp for dye-terminator reactions.

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND
ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press,
London, (1994) described the value of using sequence from both ends of
sequencing templates to facilitate ordering of contigs in shotgun assembly projects
of lambda and cosmid clones. We balance the desirability of both-end sequencing
(including the reduced cost of lower total number of templates) against shorter
read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer
compared to the M13-21 (forward) primer. Approximately one-half of the
templates are sequenced from both ends. Random reverse sequencing reactions are
done based on successful forward sequencing reactions. Some M13RP1
sequences are obtained in a semi-directed fashion: M13-21 sequences pointing
outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to
specifically order contigs.

4. Protocol for Automated Cycle Sequencing

The sequencing is carried out using ABI Catalyst robots and AB 373
Automated DNA Sequencers. The Catalyst robot is a publicly available
sophisticated pipetting and temperature control robot which has been developed
specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted
templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the
thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and
reaction buffer. Reaction mixes and templates are combined in the wells of an
aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear
amplification (i.e., one primer synthesis) steps are performed including
denaturation, annealing of primer and template, and extension; i.e., DNA
synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents
evaporation without the need for an oil overlay.

Two sequencing protocols are used: one for dye-labelled primers and a
second for dye-labelled dideoxy chain terminators. The shotgun sequencing
involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye,
permitting the four individual reactions to be combined into one lane of the 373
DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently
supplies pre-mixed reaction mixes in bulk packages containing all the necessary
non-template reagents for sequencing. Sequencing can be done with both plasmid
and PCR-generated templates with both dye-primers and dye-terminators with
approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions are loaded per AB 373 Sequencer each day, for a total
of 960 samples. Electrophoresis is run overnight following the manufacturer's
protocols, and the data is collected for twelve hours. Following electrophoresis
and fluorescence detection, the ABI 373 performs automatic lane tracking and
base-calling. The lane-tracking is confirmed visually. Each sequence electropherogram
(or fluorescence lane trace) is inspected visually and assessed for quality. Trailing
sequences of low quality are removed and the sequence itself is loaded via software
to a Sybase database (archived daily to 8mm tape). Leading vector polylinker
sequence is removed automatically by a software program. Average edited lengths
of sequences from the standard ABI 373 are around 400 bp and depend mostly on
the quality of the template used for the sequencing reaction. ABI 373 Sequencers
converted to Stretch Liners provide a longer electrophoresis path prior to
fluorescence detection and increase the average number of usable bases to 500-600
bp.

INFORMATICS

1. Data Management 

A number of information management systems for a large-scale sequencing
lab have been developed. (For review see, for instance, Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on
System Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993).)
The system used to collect and assemble the sequence data was developed using the
Sybase relational database management system and was designed to automate data
flow wherever possible and to reduce user error. The database stores and
correlates all information collected during the entire operation from template
preparation to final analysis of the genome. Because the raw output of the ABI 373
Sequencers was based on a Macintosh platform and the data management system
chosen was based on a Unix platform, it was necessary to design and implement a
variety of multi-user, client-server applications which allow the raw data as well as
analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly

An assembly engine (TIGR Assembler) developed for the rapid and
accurate assembly of thousands of sequence fragments was employed to generate
contigs. The TIGR Assembler simultaneously clusters and assembles fragments of
the genome. In order to obtain the speed necessary to assemble more than 10^4
fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences
to generate a list of potential sequence fragment overlaps. The number of potential
overlaps for each fragment determines which fragments are likely to fall into
repetitive elements. Beginning with a single seed sequence fragment, TIGR
Assembler extends the current contig by attempting to add the best matching
fragment based on oligonucleotide content. The contig and candidate fragment are
aligned using a modified version of the Smith-Waterman algorithm which provides
for optimal gapped alignments (Waterman, M. S., Methods in Enzymology
164:165 (1988)). The contig is extended by the fragment only if strict criteria for
the quality of the match are met. The match criteria include the minimum length of
overlap, the maximum length of an unmatched end, and the minimum percentage
match. These criteria are automatically lowered by the algorithm in regions of
minimal coverage and raised in regions with a possible repetitive element. The
number of potential overlaps for each fragment determines which fragments are
likely to fall into repetitive elements. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on
partial mismatches at the ends of alignments and excluded from the current contig.
TIGR Assembler is designed to take advantage of clone size information coupled
with sequencing from both ends of each template. It enforces the constraint that
sequence fragments from two ends of the same template point toward one another
in the contig and are located within a certain range of base pairs (definable for each
clone based on the known clone size range for a given library).
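
The hash-table step described above can be illustrated with a short sketch. This
is not the TIGR Assembler implementation; it only shows, under simplifying
assumptions (exact 12-mer matches, forward strand only, toy sequences), how an
index of 12 bp subsequences yields a list of candidate fragment overlaps. All
function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_kmer_index(fragments, k=12):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=12):
    """Count distinct shared k-mers for every pair of fragments sharing at least one."""
    shared = defaultdict(int)
    for frag_ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(frag_ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


# Toy example: f1 and f2 are overlapping windows of one made-up sequence,
# f3 is unrelated.
toy_genome = "ACGTTGCAAGGCTTACCGATAGGCTAACGTTAGC"
toy_fragments = {
    "f1": toy_genome[0:22],
    "f2": toy_genome[8:34],
    "f3": "T" * 20,
}
overlaps = candidate_overlaps(toy_fragments)
print(overlaps)
```

A fragment that appears in an unusually large number of candidate pairs would,
following the text, be flagged as likely to fall in a repetitive element; each
candidate pair would then be passed to a gapped alignment step before the contig
is extended.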

The process resulted in 391 contigs as represented by SEQ ID NOs: 1-391. 

3. Identifying Genes

The predicted coding regions of the Streptococcus pneumoniae genome
were initially defined with the program GeneMark, which finds ORFs using a
probabilistic classification technique. The predicted coding region sequences were
used in searches against a database of all nucleotide sequences from GenBank
(October, 1997), using the BLASTN search method to identify overlaps of 50 or
more nucleotides with at least a 95% identity. Those ORFs with nucleotide
sequence matches are shown in Table 1. The ORFs without such matches were
translated to protein sequences and compared to a non-redundant database of
known proteins generated by combining the Swiss-Prot, PIR and GenPept
databases. ORFs that matched a database protein with BLASTP probability less
than or equal to 0.01 are shown in Table 2. The table also lists assigned functions
based on the closest match in the databases. ORFs that did not match protein or
nucleotide sequences in the databases at these levels are shown in Table 3.
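
The assignment of ORFs to Tables 1-3 amounts to a two-stage decision rule, which
can be sketched as follows. The thresholds come from the text; the data
structure and function names are hypothetical, and the sketch assumes that the
best BLASTN and BLASTP results for each ORF have already been parsed elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfHits:
    """Best search results for one ORF (hypothetical container for illustration)."""
    blastn_overlap_nt: int = 0                   # length of best nucleotide overlap
    blastn_identity_pct: float = 0.0             # percent identity over that overlap
    blastp_probability: Optional[float] = None   # None if no protein hit was found


def assign_table(hits: OrfHits) -> int:
    """Return 1, 2 or 3 according to the criteria described in the text."""
    # Table 1: nucleotide match of 50 or more nucleotides with at least 95% identity.
    if hits.blastn_overlap_nt >= 50 and hits.blastn_identity_pct >= 95.0:
        return 1
    # Table 2: protein match with BLASTP probability less than or equal to 0.01.
    if hits.blastp_probability is not None and hits.blastp_probability <= 0.01:
        return 2
    # Table 3: no match at these levels.
    return 3


examples = {
    "orf_a": OrfHits(blastn_overlap_nt=120, blastn_identity_pct=98.5),
    "orf_b": OrfHits(blastn_overlap_nt=40, blastn_identity_pct=99.0,
                     blastp_probability=1e-20),
    "orf_c": OrfHits(blastp_probability=0.3),
}
tables = {name: assign_table(hits) for name, hits in examples.items()}
print(tables)
```

The nucleotide test is applied first, so an ORF that satisfies both criteria is
reported only in Table 1, matching the order of the searches described above.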

ILLUSTRATIVE APPLICATIONS

1. Production of an Antibody to a Streptococcus pneumoniae 
Protein 

Substantially pure protein or polypeptide is isolated from the transfected or
transformed cells using any one of the methods known in the art. The protein can
also be produced in a recombinant prokaryotic expression system, such as E. coli,
or can be chemically synthesized. Concentration of protein in the final preparation
is adjusted, for example, by concentration on an Amicon filter device, to the level
of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can
then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and
isolated as described can be prepared from murine hybridomas according to the
classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated
with a few micrograms of the selected protein over a period of a few weeks. The
mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma
cells, and the excess unfused cells destroyed by growth of the system on selective
media comprising aminopterin (HAT media). The successfully fused cells are
diluted and aliquots of the dilution placed in wells of a microtiter plate where
growth of the culture is continued. Antibody-producing clones are identified by
detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth.
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones
can be expanded and their monoclonal antibody product harvested for use. Detailed
procedures for monoclonal antibody production are described in Davis, L. et al.,
Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a
single protein can be prepared by immunizing suitable animals with the expressed
protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many
factors related both to the antigen and the host species. For example, small
molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations
and dose, with both inadequate or excessive doses of antigen resulting in low titer
antisera. Small doses (ng level) of antigen administered at multiple intradermal
sites appear to be most reliable. An effective immunization protocol for rabbits
can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991
(1971).

Booster injections can be given at regular intervals, and antiserum harvested
when antibody titer thereof, as determined semi-quantitatively, for example, by
double immunodiffusion in agar against known concentrations of the antigen,
begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of
Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM).
Affinity of the antisera for the antigen is determined by preparing competitive
binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of
Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For
Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in
quantitative immunoassays which determine concentrations of antigen-bearing
substances in biological samples; they are also used semi-quantitatively or
qualitatively to identify the presence of antigen in a biological sample. In addition,
antibodies are useful in various animal models of pneumococcal disease as a means
of evaluating the protein used to make the antibody as a potential vaccine target or
as a means of evaluating the antibody as a potential immunotherapeutic or
immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA

Various fragments of the Streptococcus pneumoniae genome, such as those 

of Tables 1-3 and SEQ ID NOS: 1-391 can be used, in accordance with the present 

invention, to prepare PCR primers for a variety of uses. The PCR primers are 

5 preferably at least 15 bases, and more preferably at least 18 bases in length. When 

selecting a primer sequence, it is preferred that the primer pairs have approximately 

the same G/C ratio, so that melting temperatures are approximately the same. The 

PCR primers and amplified DNA of this Example find use in the Examples that 
follow. 
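
The primer criteria above (minimum length, matched G/C content, similar melting temperatures) can be checked with a short script. The following is a minimal sketch only, not part of the disclosed method. It assumes the simple "2 + 4" rule of thumb for melting temperature (2 degrees C per A or T, 4 degrees C per G or C), which is a rough estimate for short oligonucleotides. The function names and the 4 degree C tolerance are hypothetical choices made for illustration; the 15-base minimum is taken from the text.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C: 2 per A/T plus 4 per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def pair_problems(forward: str, reverse: str,
                  min_len: int = 15, max_tm_gap: int = 4) -> list:
    """Return a list of human-readable problems; an empty list means the pair passes.

    min_len follows the 15-base minimum stated in the text.
    max_tm_gap is an assumed tolerance, not a value from the text.
    """
    problems = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if len(primer) < min_len:
            problems.append(f"{name} primer is shorter than {min_len} bases")
    gap = abs(wallace_tm(forward) - wallace_tm(reverse))
    if gap > max_tm_gap:
        problems.append(f"melting temperatures differ by {gap} deg C")
    return problems
```

Passing `min_len=18` applies the more preferred length from the text. A pair for which `pair_problems` returns an empty list meets these rough criteria only; it has not been screened for hairpins, primer dimers, or off-target binding.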

5. Gene Expression from DNA Sequences Corresponding to ORFs 

A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3 
is introduced into an expression vector using conventional technology. 
Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well 
known in the art. Commercially available vectors and expression systems are 
available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If 
desired, to enhance expression and facilitate proper protein folding, the codon 
context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, 
incorporated herein by this reference. 

The following is provided as one exemplary method to generate 
polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome 
fragment. Bacterial ORFs generally lack a poly A addition signal. The addition 
signal sequence can be added to the construct by, for example, splicing out the poly 
A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction 
endonuclease enzymes and incorporating it into the mammalian expression vector 
pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the 
LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The 
positions of the LTRs in the construct allow efficient stable transfection. The 
vector includes the Herpes Simplex thymidine kinase promoter and the selectable 
neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the 
bacterial vector using oligonucleotide primers complementary to the Streptococcus 
pneumoniae DNA and containing restriction endonuclease sequences for PstI 
incorporated into the 5' primer and BglII at the 5' end of the corresponding 
Streptococcus pneumoniae DNA 3' primer, taking care to ensure that the 
Streptococcus pneumoniae DNA is positioned such that it is followed by the poly 
A addition sequence. The purified fragment obtained from the resulting PCR 
reaction is digested with PstI, blunt ended with an exonuclease, digested with 
BglII, purified and ligated to pXT1, now containing a poly A addition sequence 
and digested with BglII. 
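
The primer design described above, with a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure of this Example. The recognition sequences CTGCAG (PstI) and AGATCT (BglII) are the standard sites for these enzymes. The function names, the 18-base annealing length, and the two extra 5' "pad" bases (added here on the assumption that many enzymes cut poorly at the very end of a fragment) are hypothetical choices, and the sketch does not check reading frame or whether the insert itself contains either site.

```python
PSTI_SITE = "CTGCAG"    # PstI recognition sequence
BGLII_SITE = "AGATCT"   # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(template: str, anneal_len: int = 18, pad: str = "GC"):
    """Build a (forward, reverse) primer pair, both written 5' to 3'.

    forward = pad + PstI site  + first anneal_len bases of the template
    reverse = pad + BglII site + reverse complement of the last anneal_len bases
    'pad' and the default anneal_len are assumptions for illustration.
    """
    template = template.upper()
    if len(template) < 2 * anneal_len:
        raise ValueError("template is too short for the requested primers")
    forward = pad + PSTI_SITE + template[:anneal_len]
    reverse = pad + BGLII_SITE + reverse_complement(template[-anneal_len:])
    return forward, reverse
```

For example, `tailed_primers(orf_sequence)` with `orf_sequence` holding the coding strand of the ORF of interest returns the two tailed primers. Because the BglII site sits on the 3' primer, it ends up downstream of the amplified insert, which is the orientation the text relies on for placing the insert ahead of the poly A addition sequence.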

The ligated product is transfected into mouse NIH 3T3 cells using 
Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions 
outlined in the product specification. Positive transfectants are selected after 
growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). 
The protein is preferably released into the supernatant. However, if the protein has 
membrane binding domains, the protein may additionally be retained within the cell 
or expression may be restricted to the cell surface. Since it may be necessary to 
purify and locate the transfected product, synthetic 15-mer peptides synthesized 
from the predicted Streptococcus pneumoniae DNA sequence are injected into mice 
to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae 
DNA. 

Alternatively, and if antibody production is not possible, the Streptococcus 
pneumoniae DNA sequence is additionally incorporated into eukaryotic expression 
vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease 
cleavage sites are engineered between the globin moiety and the polypeptide 
encoded by the Streptococcus pneumoniae DNA so that the latter may be freed 
from the former by simple protease digestion. One useful expression vector for 
generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit 
globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases 
the level of expression. These techniques are well known to those skilled in the art 
of molecular biology. Standard methods are published in methods texts such as 
Davis et al., cited elsewhere herein, and many of the methods are available from the 
technical assistance representatives from Stratagene, Life Technologies, Inc., or 
Promega. Polypeptides of the invention also may be produced using in vitro 
translation systems such as the in vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes 
of clarity and understanding, one skilled in the art will appreciate that various 
changes in form and detail can be made without departing from the true scope of 
the invention. 

All patents, patent applications and publications referred to above are 
hereby incorporated by reference. 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Charles Kunsch 
Gil H. Choi 
Patrick S. Dillon 
Craig A. Rosen 
Steven C. Barash 
Michael R. Fannon 
Brian A. Dougherty 

(ii) TITLE OF INVENTION: Streptococcus pneumoniae Polynucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 391 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: USA 

(F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 



WO 98/18931 PCT/US97/19588 




(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Brookes, A. Anders 

(B) REGISTRATION NUMBER: 36,373 

(C) REFERENCE/DOCKET NUMBER: PB340P1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (301) 309-8504 

(B) TELEFAX: (301) 309-8512 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5625 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CCAAGCAAAA CCAGCTACAG CTAAAGGAAC TTACGTAACA AACTTGACTA TCACAACTAC 60 

TCAAGGTGTT GGTATCAAAG TTGACGTAAA CTCACTTTAA TCAGTAGTTA AAGTAATGTA 120 

AAAAAGTTGA AGACGCTATG TCTCAACTTT TTTTGATGTA CGACGGGCAT GTTGTATAGT 180 

AGATGTGTAC TATTCTAGTT TCAATCTACT ATAGTAGCTC AGAAGTCGGT ACTTAAACGT 240 

GCTATATCAA AACCAGTCCT TGAAAAACGT GGACTGGTTT CGTGTTTGGA TTATTACCTT 300 

GAACGACATG CGTTAAAAGT TAGTTGAACC GCCGTATGCC GAACGGACGT ACGGTGGTGT 360 

GAGAGGGGCT AGAGATTATC CCCTACTCGA TTTCGAAATC TAGTGGAATG AATCTGGAAT 420 

AGTCCATCGA GCTTTCTAAT ACTCTTCGAA AATCTCTTCA AACCACGTCA ACGTCGCCTT 480 

GCCGTGCGTA TGGTTACTGA CTTCGTCAGT TCTATCCACA ACCTCAAAAC AGTGTTTTGA 540 

GCTGACTACG TCAGTTCCAT CTACAACCTC AAAACAGTGT TTTGAGCAAC CTGCGGCTAG 600 

TTTCCTAGTT TGCTCTTTGG TTTTCATTGA GTATAACACA TTGTTAGAAG TTGGTTTAAA 660 

TTTCCTAATC A G TTTGTTCA CATTTACCTT CGATATATTA TATCCCATAG TTAAGGTTGG 720 

TCATACAGAT GATTATAGTC ATGGAGCCGT AAAACTTAGT GTTTCTTTAG TTGACAAAGA 780 

TGCCATGAAA AAAATATTTG TAACTGTAAT AGGATATTTT GAAATAAATA TAGATGAAAA 840 

TATCACCGAT ATTCTATACG TAAATGGTAC TGCTATTCTT TATCTTTATT TACGTTCAAT 900 

TGTTTCAATA GTTTCGGCAA TTGATAGCAG TGAAGCAATG TTGCTACCTA TCATTAATGT 960 

TTTAGAGTTA CTAGATAAAT CTCAACCTTT TGAAGAAGAA TAATTTATTA GCTCACTAAA 1020 

TTGAGGGTAA GGAAAAGTAA AAGCAGTAAG AAAAATGTCT TGCATTATAC AGCAACCTTT 1080 

TGCGAATGAG TGGATGGATT GAATAAAATT TGATTAAGAG TGGATGATTT ATCTGTAGAT 1140 

TATTATTGGA CAGTTAGTCT TGAAGTAGTC TAAGAATTAG GTTATAATCA CTAGAAGCCT 1200 

TGCTAATAAT GAGGAGGTTA GTTTATGTAT AGTAGACTGA ATCTAAAATA GTACGAAACA 1260 

ATTGCTAAAA CATTTATAGA AATTAATTTT ACTTTCCCAA TCGATTTGTT CTCATCTTAT 1320 

TTCAATCCGC TATATATTAT GGTATCGAAT CTTCATCAGA ATGATAAAAT TAATCAATTG 1380 

ATATCTGATT ACAAACAGAA TATGAAAGCT TTTTATATCA CTATTGAAAA ATTTATACGA 1440 

GATGATGAAA GCCTTAAGTG TTATTTTATA AAGGTTATTT CAAGTCCTTC CAAGGTAACA 1500 

AGTCTAGATC AGATTGAAGC TGATAAAACG ATACAAAGAA AATATTCAAG TGAGCTAAAA 1560 

AAATTTATTG GATTTTATAA TGAGATTATT TGTGAGGAAA ATAGTTTCCT ACATGTACGA 1620 

AAGAGGTGGT CGAGTTGGTT TAGGTAGTCG ATGCGTGAGT TGATAATTCT CAGGGTATGG 1680 

ACTTCTTTTT CATGAATGAG GTAAAAGAGC AGGTATTGTT TAGAGACAAT CATTCTGAGC 1740 

ATATTTTCTG GATAGAGGGA GTATCCGATT TTATGATCAA AGTTAATACC GCCCTCTGGT 1800 

GAGAAGATGA GTAGGTTGGT AATTTAAACT ATTAAACAGA ATTTTTGATT AAAAGTATTA 1860 

TTTCATGAGA GAAATCCTAA TTTCACAATC CATAGGCAAA CGCTTGCATT TCGTTTTTTA 1920 

TTGGACTATA ATAGGTTGGT ATAAAGCCTT CTGTAGTAAT AAAATGTAGA AGGTGTAGAA 1980 

AGTAAGGATT TAGAATATTT GTAGTTAAAA ACACAATGTT GCTATTCCTT ACGATAGGGA 2040 

GATAGATATG GCAATGATAG AAGTGGAACA TCTTCAGAAA AATTTTGTGA AGACTGTTAA 2100 

GGAACCGGGC TTGAAGGGGG CTTTGCGCTC CTTTATTCAT CCTGAAAAGC AGACCTTTGA 2160 

AGCGGTCAAG GATTTGACCT TTGAGGTTCC AAAAGGGCAG ATTTTAGGAT TTATCGGGGC 2220 

AAATGCTGCT GGGAAGTCGA CAACCATTAA AATGCTGACA GGAATTTTGA AACCAACATC 2280 

TGGTTTTTGT CGGATTAACG GCAAGATTCC CCAGGACAAT CGGCAAGATT ATGTCAAAGA 2340 

TATTGGCGTA GTCTTTGGAC AACGCACCCA GCTATGGTGG GATTTGGCTC TGCAAGAGAC 2400 

CTACACTGTC TTAAAAGAGA TTTATGATGT GCCAGACTCG CTCTTTCATA AGCCTATGGA 2460 

CTTTTTGAAT GAAGTCTTGG ATTTGAAGGA CTTTATCAAG GATCCCGTGC GGACTCTTTC 2520 

ACTGGGACAA CGCATGCGGG CGGATATTGC GGCCTCCTTG CTCCACAATC CCAAGGTTCT 2580 

TTTTTTAGAT GAGCCGACCA TTGGTTTGGA CGTTTCGGTT AAGGATAATA TTCGTCGGGC 2640 

AATTACTCAG ATCAATCAAG AGGAAGAAAC TACCATTCTT TTGACCACTC ACGATTTGAG 2700 

TGATATTGAG CAACTTTGTG ATCGGATTTT CATGATTGAC AAGGGGCAAG AGATTTTTGA 2760 

TGGAACGGTG AGCCAACTCA AGGAGACCTT TGGTAAGATG AAGACTCTCT CTTTTGAACT 2820 

GCTACCAGGT CAAAGTCATC TCGTCTCTCA CTATGACGGT CTGTCTGATA TGACCATTGA 2880 

TAGACAACGA AACAGCCTCA ACATTGAATT TGATAGTTCT CGCTACCAGT CAGCTGACAT 2940 

TATCAAGCAA ACCCTGTCTG ATTTTGAAAT CCGCGATTTG AAGATGGTGG ATACGGATAT 3000 

TGAGGATATT ATCCGTCGCT TCTACCGAAA GGAGCTCTAG GATGATCAAA TTGTGGAGAC 3060 

GTTATAAACC CTTTATCAAT GCAGGGGTTC AGGAGTTGAT TACTTACCGA GTCAACTTTA 3120 

TTCTCTATCG GATTGGCGAT GTCATGGGGG CTTTTGTGGC CTTTTATCTC TGGAAGGCTG 3180 

TCTTTGATTC TTCGCAAGAG TCTTTGATTC AGGGCTTCAG TATGGCGGAT ATCACCCTCT 3240 

ACATCATCAT GAGTTTTGTG ACCAATCTTC TGACTAGATC CGATTCGTCC TTTATGATTG 3300 

GGGAGGAGGT CAAGGATGGC TCCATTATCA TGCGTTTGTT GCGACCAGTG CATTTTGCGG 3360 

CCTCCTATCT TTTCACCGAG CTTGGTTCCA AGTGGTTGAT TTTTATCAGC GTTGGCCTTC 3420 

CATTTTTAAG TGTCATTGTC TTGATGAAAA TCATATCGGG TCAAGGTATT GTAGAGGTGC 3480 

TAGGATTAAC TGTCATTTAT CTTTTTAGCT TAACGCTCGC CTATCTGATT AACTTTTTCT 3540 

TTAATATTTG CTTTGGATTT TCAGCCTTTG TGTTTAAAAA TCTTTGGGGT TCCAACCTAC 3600 

TTAAGACTTC CATAGTGGCT TTTATGTCGG GGAGTTTGAT TCCCTTGGCA TTTTTTCCAA 3660 


nVAJ 4 iVJ 1 X A \* 


AflATATTCTC 


TCCTTTTTGC 

A 4 A A A A 


CTTTTTCATC 


CTTGATTTAT 


ACTCCACTTA 


3720 




TGGAAAATAC 


GATGCCAGTC 

\#aa A UbvAW A 


AGATTCTTCA 


GGCACTCCTT 


TTGCAGTTCT 


3780 


1 V« 1 \J^JV^ A V* L * 




GGATTGTCTC 

UMTt* 4 A X* ft W 


AGTTAATTTG 


GAAACGGGTC 


CAGTCCTTTA 


3840 




aggaggttag 


TATGAAAAAA 


TATCAACGAA 


TGCATCTGAT 


TTTTATCAGA 


3900 




AAPAAATCAT 


GGAATATAAG 


GTAGATTTTG 


TGGTTGGTGT 


CTTGGGAGTC 


3960 


X 4 1 V. 1 1 *w 


AAGGf*TTGAA 


1 V 1 w 1 1 \J A A I 


CTCAATGTCA 


TCTTTCAACA 


TATTCCATTC 


4020 




fin ArrTTTf a 


AGAOATAGCT 


TTCATTTATG 

A ft W* AAA *• A w 


GATTTTCCTT 

Ufl A A A A ^* 4 A 


GATTCCCAAG 


4080 




f\ 1 N* 1 \« 1 X I X 1 


TCACAATCTC 


TGGGCACTAG 


GGCAACGCCT 


AGTCCGAAAA 


4140 




ftp AAfJTATfT 


GACTCGTCCC 


ATCAATCCTC 

■ A A ^v^4vb A A ^p> 


TCTTTCACAT 


CCTAGTTGAA 


4200 


APPTTTPAflA 


X 1 W/l A VlVrW X X 


GGGTGAACTC 


TTAGTCGGTG 

A A m\\^ A V* A 


GTATTTTATT 

4 ■ * 4 4 44P«4 * 


GGGAACAACA 


4260 


G^ACCAGCA 


TTG* T *TTGGAC 


TCTTCCAAAA 

ft w ft ft wworwiri 


TTCCTGCTTT 


TCCTAGTTTG 


TATTCCTTTT 


4320 


GCGACCTTGA 


AAA /\ A 4 A \* 


TCTTAAAATC 

A ^ a A onnn a ^ 


GCAACAGCCA 


GTATCGCCTT 


TTGGACTAAG 


4380 


PAGTCAGGCG 


CPATGATTTA 


ATP* P , I*CT AT 

^nA*% A ^ A A W A «» A 


ATGTTCAATG 

n a ^« a a ^*>nr* a 


ACTTTGCTAA 


GTATCCGATT 


4440 


TCTATTTACA 


^> A A W 4 1 A W * 


TCGTTGGTTG 

A N»\# 4 A V^N* ft * 


ATTAGCTTTA 


TCGTGCCTTT 


CGCCTTTACA 


4500 


GCCTACTATC CAGCTAGCTA TTTCTTACAG GAAAAGGATG TGTTCTTTAA CGTAGGAGGT 4560 

TTGATGTTGA TTTCTCTGGT TTTCTTTGTT ATTTCCCTTA AACTTTGGGA TAAGGGCTTA 4620 

GATTCCTACG AAAGTGCGGG TTCGTAAAAG CTAAAGTAAG ACTAAAATCA AGAAAGAAAC 4680 

TTATGATGTT TGTAATTGAA GAAGTCAAGG ATGAAAATCA AAAAAAGGCA GTTGTCGCTG 4740 

AGGTTTTGAA GCATTTGCCA GAATGGTTTG GAATCCCAGA AAGCACACAA GCCTATATAG 4800 

AAGGAACCAC GACACTGCAA GTTTGGACCG CCTATCAGGA GAGTGATTTG ACTAGATTTG 4860 

TAAGCTTATC CTATTCGAGT GAAGATTGTG CAGAGATTGA TTGTCTCGGC GTAAAAAAGC 4920 

TTATCAAGGT AGAAAAATTG GGAGCCAATT GCTTGCTACT TTAGAGAGTG AAGCTCGTAA 4980 

AAAAGTTGCT TATCTGCAGG TCAAAACAGT GGCAGAAGGT TCTAATAAAG ATTATGATCG 5040 

AACAAATGAC TTTTATCGAG GTCTTGGCTT TAAAAAGTTA GAGATTTTTC CTCAACTATG 5100 

GAATCCGCAA AATCCTTGTC AGATTTTGAT TAAAAAGCTT GAATAATATT ACTTGACATC 5160 

TATTCTCAGA GTGCTATACT GTAAGTGTAA TCGCCGATTT AGCTTAGTTG GTAGAGCAAG 5220 

GCACTCGTAA AGCCTAGGTT ATAGGTAGAT AAACGACTGA GGATTTGAAA AAATAGATAG 5280 

GTAGAAGATA ACCCTTAAGC CTTACTCTTA GCGGTTATTT ATATTGTTTA ATAGCGCTAA 5340 

TATTTTATCA ATTATGCCTG TTTTCGTGTT TCTGGTAGTT GTTCAAGTTT ATTGCTACTA 5400 

TTTTTGATGG TATGAATGTG CTTATAATGT ATCCCGGTTA ACGAAAGTTT TGGACTTATA 5460 

CTCTTCGAAA ATCTCTTCAA ACCACGTCAA CGTCGCCTTG CCGTGCGTAT GGTTATGACT 5520 

TCGTCAGTTC TATCCACAAC CTCAAAACAG TGTTTTGAGT GACTACGTCA GTTCCATCTA 5580 

CAACCTCAAA ACACTGTTTT GCCCAATCTG CGGCTAGTTT CCTAG 5625 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7571 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

CTCTCCAGCT TTCCTTGCGA GTTGGCCATG TTGTGTCTTT AAGAAGTCTA AAAATATCTC 60 

CAATAAAACG CATCGCTCTC TCCTATCTCG TTTCTCTGTG TGTAGTGTAC TTGCCACAAT 120 

GCTTACAAAA TTTATTTACT TCTAGTCGTG TAGGCTTGAG GTTTCCGCTG ATCTTGATTG 180 

AATAGTTTCT CGAACCACAA ACCGCACAAG CTAGGCTTGC TTTTTTTAGT GCCATAACGC 240 

CTCCATCTTA TCCATTATAA CAAGAAAGCT AGGCTTTGAC AAGCATCTTA GCGAAATAGA 300 

TTGACTATCG AATCCCATAT TGTTTGAGCC TTTTCCTTAA TCTTCGCATC TGAGATAGCC 360 

CGGCTAGCCT CATCTACTAG ACTTTGCGCA CGCCCTCGAA TATCAGACAA ATTATCATCT 420 

GTCTGGCTAT TATCATTGGT TTGTACTTGT CTTTTTGTAT TGGCTGGTGC AATTCCATTT 480 

TGCTTATAAG CATTTTCAAC CGTAAAGGTA CTTCCTGGCG TATAAGGTAA AATGGTATTG 540 

GCAATGTTTC TAAAGACATG AGCTGCACCG TTTGAAGTAG AGCCAGCTAG ATAGTGGTTT 600 

TCATCAGTGG TCGGAAAGCC AAGCCAGTGG CTAATCACTA CATCCGGAGT ATAACCAATT 660 

ACCCACTGGT CACTTGTGTA CTCCGGATTG AAAACTGCTT CACTTGTTCC AGTTTTCCCT 720 

GCCATGACAT AGTCTGCAGG CGATGAACTA ATACCGGTAC CGTTGGTGAA AGTCCCCAAC 780 

ATCATACTGG TCATCTTGTC AGCTACAGAC TTATCAATCA CCCGTTTTTG TGAATTTTTA 840 

TGACTCGCAA TAACTTGTCC ACTAGCATTT TCAATTCTAC TAATAAAATG AGCTTCAGGC 900 

ATTAAACCTT CATTTGCAAA GGCGGCGTAT GCTTGAGCCA TTTGAAGAGG GTTGGTTTCA 960 

ACACCGCTTC CCAAGGCGAC ACCAAGAACA CGGTCGACCT TTTCCATGTT GAGTCCGAAT 1020 

TTTTCGCCTG CCTCAAAACC CTTGTCGACA CCCAAATCAT TAACAGTGCC AACAGCAGGT 1080 

AGATTAAGCG ATTCTGCCAA GGCTTGATAC ATAGGAACTT CTCGACTCGT TTTGATCCCT 1140 

GCATAGTTAT CAACCTTATA GCTGTCATAC TGCATGGTAT GGTTATCCAA CTGCTTATTC 1200 

AAAGCCCAGC TTGCTTCAAC TGCTGGCGTA TAAACAACTA AAGGCTTAAT TGTAGAACCA 1260 

GGACTACGCT TTGATTGGGT TGCATAGTTG AAATTCCGGA ATCCAGTTTT ATCATTGTCA 1320 

GCAACTTGAC CGACAACTCC ACGAACTCCC CCTGTTTTCG GTTCGAGGGC TACACTTCCT 1380 

GATTGAGCAA ACGTTCCATC CTCTGCCCTC GGAAATAGCG ATGTGTTTTC ATAAACAATC 1440 

TGCATATTTG CTTGGTAGTT TTGGTCCAGC TCTGTGTAAA TGCGGTAGCC ATTATTGACA 1500 

ATCTCTTCCT CTGTTAGATT ATACTTGGAA ACACCTTCAT TAACCACCGC ATCAAAATAA 1560 

GAGGGGTAAC GGTAATCTGA GATTTTTCCT TCATACTTAT CGTGCAATTG CGAAGTCATA 1620 

TCAACTTCAG CAGCTTTGGT TTCTTGGTTT TTATCAATAT ATCCTGCTGC AACCATATTC 1680 

TGCAAGACAG TATCGCGCCG ATTAGTAGAA TCTTCTACGG AATTCAAGGG ATTATACAGT 1740 

TCCGGCCCCT TGAGCATCCC TGCCAGAGTC GCAGCTTGAT CCAGACTCAC TTCTGATGCA 1800 

GAAACTCCAA AGTATTTCTT ACTCGCATCT TCTACACCCC ACACACCATT TCCAAAATAA 1860 

GCGTTGTTAA GGTACATGGT TAGAATTTGC TCCTTACTAT ATTTTTTGCT TAATTCTAAG 1920 

CCAAGGAAAA ATTCTTTCGC TTTTCTCTCA ACAGTTTGAT CCTGCGATAA ATAGGCGTTT 1980 

TTAGCCAGCT GTTGGGTAAT GGTAGAGCCA CCACCTGAAC GTCCAGCAGT GACAATAGCC 2040 

AAbAAAAAAL GGCCATAGTT AATCCCGTCA TTTTTATAGA AAGAACGGTC TTCTGTCGCA 2100 

ATAACAGCAT TCTGCAAGTT TTTACTGATG TCAGTCAGCT CAACATAGGT TCCCTTTTGA 2160 

CCAGACAAGG CACCAGCCTC TTTTTCTTCA CGGTCAAAAA TAAGAGTCCG AGTTTTCAAG 2220 

GCATTTTCCA AATCATTGAC ATTGGTCGAC TTGCCTACAG CAAACAAATA GATTCCAACT 2280 

AGCAAGCCTG CACTCAAACC TAGTATAAGG ATAATCTTTG TTAGATGATA ACGACGCCAG 2340 

AATTTTCGAA TCGGACCTAC TTGGGCTAAT TTTTTTCGAT CACTACGAGA GCGACGTAAG 2400 

ATAGTAGAAT CAGAGTCCTC TAGTTCACTT GTTTCTTTTT TAAAAAGAGA AAGAAATTTC 2460 

TCAAATAATT TATCTAATTT CATGCGTTTA TTTTATCATC TTCATCATAG GAAGACAAGA 2520 

ATTTAGCTAT TTCCTATCCA AATAGGGCTT TTTTTGTTAC AATATCTGTA TGCAATTCAC 2580 

ATTTACATTA CCCGCCTCTC TACCTCAAAT GACAGTAAAG CAATTACTTG AGGAACAACT 2640 

CCTCATCCCT AGAAAAATCC GTCATTTTTT GAGAATCAAG AAACATATTT TGATAAATCA 2700 

AGAAGAAGTC CACTGGAAGG AAATCGTAAA TCCTGGAGAT GTTTGCCAGT TGACTTTTGA 2760 

CGAGGAAGAT TATTCCCAAA AGACGATCCC TTGGGGCAAC CCAGACTTAG TGCAGGAAGT 2820 

TTATCAAGAT CAACACTTGA TTATTGTAAA CAAACCAGAG GGGATGAAAA CGCATGGTAA 2880 

TCAACCAAAC GAAATTGCCC TTCTTAACCA TGTCAGTACC TATGTTGGCC AAACCTGCTA 2940 

TGTCGTTCAT CGTCTGGACA TGGAAACCAG TGGCTTAGTT CTCTTTGCCA AAAATCCTTT 3000 

TATCCTGCCC ATTCTCAATC GCTTATTGGA GAAAAAAGAG ATTTCTAGAG AATATTGGGC 3060 

TCTAGTTGAT GGAAATATCA ACAGAAAAGA ACTTGTTTTC AGAGACAAAA TTGGACGTGA 3120 

TCGCCATGAT CGTAGAAAAA GAATAGTTGA TGCAAAAAAT GGGCAATATG CTGAAACGCA 3180 

TGTAAGCAGA TTAAAGCAAT TCTCAAACAA GACTTCCTTG GCTCATTGCA AGCTAAAGAC 3240 

AGGGCGAACC CATCAGATTC GTGTGCACCT TTCGCATCAT AATCTTCCTA TCCTGGGAGA 3300 

CCCTCTCTAT AATACTAAAT CAAAGACAAG CCGGCTTATC CTTCATGCCT TCCGACTTTC 3360 

CTTTACCCAC CCACTTACTT TAGAGAAGCT AACTTTCACT ACCCTTTCAA ATACATTTGA 3420 

AAAAGAATTA AAAAAGAATG GATGATCGTG TCATCCATTT TTCCATATAA AAAAGCAAGA 3480 

CCACAAAGCC TTGCTTTCTA TCAACTCAAG AATTATTTAG CAATTTTTGC GAAGTATTCA 3540 

AGAGTACGAA CAAGTTGTGC AGTGTATGAC ATTTCGTTGT CCTACCATGA TACAACTTTA 3600 

ACCAATTGTT TACCGTCAAC GTCAAGAACT TTAGTTTGAG TTGCGTCAAA CAATGAACCG 3660 

TAAGACATAC CTACGATATC TGAAGATACG ATTGGATCTT CTGTGTAACC GTATGATTCG 3720 

TTTGAAGCTG CTTTCATAGC TGCGTTCACT TCATCAACAG TAACGTTCTT TTCAAGAACT 3780 

GCTACCAATT CAGTAACTGA TCCAGTTGGA GTTGGAACGC GTTGTGCAGA TCCCTCAAGT 3840 

TTACCATTCA ATTCTGGGAT TACAAGACCG ATAGCTTTTG CAGCACCAGT TGAGTTAGGA 3900 

ACGATGTTTG CAGCACCAGC GCGAGCACGG CGAAGGTCAC CACCACGGTG TGGTCCGTCA 3960 

AGGATCATTT GGTCACCAGT GTAAGCGTGG ATAGTAGTCA TCAATCCTTC AACAACACCA 4020 

AAGTTGTCTT GAAGAGCTTT AGCCATTGGA GCCAAGCAGT TTGTAGTACA TGAAGCACCT 4080 

GAGATAACTG TTTCAGTACC GTCAAGAACG TCGTGGTTAG TGTTGAATAC AACTGTTTTA 4140 

ACGTCGTTTC CACCAGGAGC AGTGATAACA ACTTTTTTAG CTCCACCTTT AAGGTGTTTT 4200 

TCAGCTGCTT CTTTCTTAGC AAAGAAACCA GTAGCTTCAA GAACGATTTC TACACCGTCA 4260 




GTAGCCCAGT CGATTTGTTC TGGATCACGT TCAGCAGAAA CTWSWGAA TTTACCGTTA 4320 

ACTTCAAATC CACCTTCTTT AACTTCAACA GTACCGTCGA AACGACCTTG AGTTGTGTCG 4380 

TATTTCAACA AGTGTGCAAG CATAACTGGA TCTGTAAGGT CGTTGATGCG TGTAACTTCA 4440 

ACACCTTCTA CGTTTTGGAT ACGACGGAAA GCAAGACGAC CGATACGTCC GAAACCGTTA 4500 

ATACCAACTT TAACTACCAT TAGTGATTTC CTCCTTATGA AAATCATGAA ATTTTTATTG 4560 

TGAAAAGAGT AACTTGAATC ACTACAAATC ACCTTTCAAC AAACCTATTA TACAACTATT 4620 

TGAGTTGAAT TGCAAGTATG GCCATTGTTT TTCTATGTTA GTTTCTTTTT AAGACTGTAA 4680 

ACCAAGGAAT CCCTTACTAT TCATAGCATA ACGATTCTAT AGGATCCATT TTACTAATCT 4740 

TACGCGCCGG GAAGTAGGCT GAGACATAAC CAAGTAATAG AGCGAAAACT AGAGTTCCTA 4800 

AAACAGATAA AAGATTTAAT TTAAAAACCT TAGTGATGGA TGGGTAAAAG TGACTTACAA 4860 

TCGCATTCGC CAAACTTCCC ACCCCTTGTG CAACCAAAAA TGCCAGCAGC AAGGCGATGC 4920 

CTACAATCCA GATAGCCTCG TAAATAAAAA TTCCTTTGAC ATCACGATTC TGATAACCAA 4980 

CTGCTTTCAT GACACCTATT TCCTTGGAAC GTTGCATGAT ATTGATGTAA ATAATGATAC 5040 

CAATCATAAC CGCTGCTACC ACAATAGCTT GTGATGAAAG CACAATCAAT AATCCCTGAA 5100 

TAACACGAAT AAAGGTAATC ACAATATCAA GAACTCTCTG TTGAGAAAGC ACAGTATACT 5160 

TCTTATTTTT CTGTAATTCT TCTGTTACTA CTTTTGTCTG TGATGGATCT TTGAGTTCCA 5220 

AGATAAAATA AGATACAGCT TTCGTAAATC CAGCCTCTTT CAAAATCGTT TCCATTTGAT 5280 

GAGACAGCAT GAAACTGTTG CTGTCCTCCA TGTCATCTTC ATCATTGATT ACACGTACAA 5340 

TCTTCGTTTG AAATTGAGCA ATCTTACTAG TTTCGGCAGC ACTTTCTACA ATGCTGGCTG 5400 

AGACTGATTT GCCAATAAGA TCATTAGCTG TCAAATTTTT TCCTGTCTGT TCATTCCAAT 5460 

TTTTTAGTAA ACTGCTTGGA ATCGTTAATC CCTGTTCATT TGTATCAGTA TAGAGGGATC 5520 

CAGCCAACAC TTTGTCCGTC TCATTATTAC TAACAGAGAT ACTTGTATCA TCATAAAGAC 5580 

TCACTACTTG AGCATAAGAA GGCATCGTTT GACTCAGATC CATTTCTTGC CCATCTATAG 5640 

TAATATTTGA CATGTTCATC CCAAAAGGAC TCTCCAAATA TTTAATAGCT TCTTTCCCAA 5700 

CTGTATCCGT GATATATAGT CAATTGAAAC AAGAGCAGGA TAAAAAAGCC TCGTAAAAGG 5760 

TATTGCAACT TGGTAATACC TTTTTGAGGT GCTTTTTGAT ATGAGCCCAT GTTTTCTCAA 5820 

TAGGATTGTA CTCAGGCGAG TAGGGAGGAA GAGGTAAAAG TTTATGCCCA AACTCTTCGC 5880 

ATAAAAGTTC TAGCTTCCCC ATTCTATGGA ATCTTACATT ATCCATAATA ATAACCGATG 5940 

GTGTGTTTAA TGTTGGTAAG AGAAAATTCT GAAACCAAGC TTCAAAAAAC TCGCTCGTCA 6000 

TCGTCTCTTC GTAAGTCATT GGAGCGATTA ATTCACCATT TGTTAGACCT GCAACCAAAG 6060 

AAATCCTCTG ATATCTTCTT CCAGATACTT TGCCTCTTAT TAATTGACCT TTTAATGAGC 6120 

GACCATATTC TCGATAAAAA TAAGTATCGA ATCCTGTTTC GTCAATCTAA ACAGGTGCTA 6180 

GGTGCTTTAA ACTATTAAAA TTCTTAAGAA ATAAGGCTAC TTTTTCTGGG TCTTGTTCAT 6240 

AGTAGGTGTG GTTCTTTTTT CGAGTGTAGC CCATAGCTTT GAGCGTATAG TGGATGGTAG 6300 

TTGGATGACA GCCAAATTCA GAAGCTATTT CAGTCAAATA AGCGTCTGGA TTGTCAGTAA 6360 

GATAGTTTTT AAGTCTATCT CTATCAACCT TTCTTGGTTT TATTCCTTTT ACTTGGTGGT 6420 

TTAGCTCTCC TGTTTTCTCT TTTAGCTTTA ACCAGCCATA AATGGTATTA CGTGAGATTT 6480 

GGAAAACGTG TGATGCTTCT GTTATACTAC CTGTTCGCTC ACAATAAGAG AGAACTTTTT 6540 

TACGAAAATC TATTGAATAT GCCATAAAAA GATTATACCA CATTGTGTAC TATTTTTGGT 6600 

TCATTTTACT ATATTTGAAG AGGCGTTTAA ACTATCTGAC ATAAAACTCG TTCTAGAGGA 6660 

AAGACATCCT TTAAAAAGTT AGTTTATTTT ACAACTTAGA CATCAAGGTA GGTTAACCCC 6720 

TTCATGGAAA AATCAAGACT CTTAGCACTA TGGGTTAAAC TACCACTGGA GACGTAATCA 6780 

ATCGCTAAAC CACGAAAACG GCTAATAGTG GTCATATCAA TATTTCCAGA ACATTCAATC 6840 

CGAGAACGTC CTGCAATTAG GGTAATGGCC TGTTCAATCT GTTCCAATGA CATATTATCC 6900 

AACATGATAA TATCAGCACC CGCCGCCGCA GCTTCTTCGG CAGCAGCAAG GCTTTCCACT 6960 

TCCACCTCGA CCATTTTCAC AAAAGGGGCA TAGGCACGCG CTTGAGCAAT TGCCTTTTGA 7020 

ACACTACCTA CTGCCGCAAT GTGATTGTCT TTTAGCAGGA TAGCATCTGA TAAATTAAAG 7080 

CGATGATTAT AGCCACCGCC AACTCTCACG GCATATTTCT CAAAAAGACG TAAATTAGGA 7140 

GTAGTTTTTC GAGTATCAAA TACCTTAATG CAATCATCGC CTAAGGCTTC TACATAAGCA 7200 

GCTGTCA7CG AAGCAATCCC TCATAAATGT TGTAAAAAAT TCAAGGCAAC GCGTTCACAT 7260 

GTTAAGAGAC TTCTCACCGA GCCTATGATT TCTAAAACCA AATCGCCACT AGTCAAACGA 7320 

TCCCCATCCT TAAATTGATG AGGATTCTGG AAGGTCACCT CGGCATCAAA TAGGGTAAAA 7380 

ACCCTTTGAA AAACGGTTAG CCCCGCTAAA ACACCAGCTT CCTTGGCAAA AAGCGACACC 7440 

TTGGCTTGGC CATGATGATC AAAAATGGCA TTGGTACTGT AATCTTCGGA ATGAACATCT 7500 

TCTCGCAAGG CTGCTTTCAA TGTATCATCT ATTTGAAAAG GGGTTAAATC AGTTGAAATG 7560 

ATTGACATCA C 7571 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26385 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TTTCCTAGTG GCTTAAATTC TTCAGGAAAA TCAGGCGTAT CTAAAAGTCG TGTCGTTTTT 60 

GTTTCATCTA TATAAAGACT TCCTGCTCCC CCTACAACTA GAAAACGTGT CTGTGTTCCA 120 

GCAAGAAGCT GATTAAATAG TTCGATTGAT TTGCTGTGGA GCGGTAGCGT ATCTGGTGTA 180 

TAAGCACCAA ACGCTGAAAT AACAGCATCA AATCCAGTAA GATCATCTTT TGTCAACTCA 240 

AATAAATCTT TTTTAATAAT AGACTCAGCT TGACTTTTGT TTTCAGAACG AACAATAGCC 300 

GTTACTTCAT GTCCTCGTTT GACTGCTTCT TCAACAATTG CTTTCCCCGC TTGTCCATTT 360 

GCTGCAATAA CTGCTAGTTT CATTTTTTAT ACCTCTCTTG TTGTAATTAT TTTAGTTACA 420 

GAAATTGTGA CACTCTTAAT AATCAATGTC AATAGTCTTG CTTAATTATT ATCAAAATAT 480 

TTCTACCAAG AAAACTAACC ATGATTCTAG TGAAAAAAAA TCTTCTTTGT CAACAAATTT 540 

ACTTTCTTGT TTTAAACATG CTATAATAAT CATAGCAAGA GATCTAAGTT GTCTGTTTTT 600 

TTAAAACGAG GTGATTATCA TGCGTAGATT CTATTCCCAT CTCCCCTACT ATCTGGTCAT 660 

ATTATTCTTT TATTGGCCAC TTTATGAGTT GTTCTTACTA GTTGTTTCTC ACCCCCTTAC 720 

ACTCAAGGGA CTCTATATAA ACAATCTTCT CTTCTTTACA CCTCTGGTAA TCTTGATTGT 780 

ATCGTTACTC TATAGCTACC GTTTCCGTTT CTCACTTTGA TGGTTAGTTG GTAACGGACT 840 

GCTCTTTTAC TTTACTATCA TAACCTTTGG TGAGTTTATA CTAATTTACT TGCTAATCTA 900 

TGAAACAGTT GCTCTGGTCG GCATGGATTC TGGTATTAGC ATCAAGCATA TTCTACAAAA 960 

AATGAAAAAC AAAAAACTTT CACAAAATCC TTGAAAAATC TCACAATCAT GCTATAATAA 1020 

TCCATAGAGA CAAGTCACTT AGTCCCTTTC TACTAGACAG TGCGTGGTTG CTGGAAACGC 1080 

ATAGGAAGTC TAAACTGATA CTACTCTTGA GTTTTTTATG AAAACATAAA ACGGTGGCCA 1140 

CGTTAGAGCC GATCAGAGGT GTCCCTCTCT TTTGAGCTAC ATAAATGAAG GTGGAACCAC 1200 

GTTGCGACGT CCTTTCGAGG ATGTCGCATT TTTTTATTAG GATACTAATT ATGGAGTTCC 1260 

AAGAATTAGT GGAGCGCAGT TGGGCAATCC GACAAGCTTA TCACGAACTG GAAGTTAAGC 1320 

ATCATGATTC CAAGTGGACG GTAGAAGAAG ACCTCTTGGC TTTATCTAAT GATATTGGAA 1380 

ATTTCCAACG ACTGGTGATG ACAAAGCAAG GACGCTACTA TGATGAAACA CCCTACACAC 1440 

7GGAACAAAA ACTTTCAGAA AATATCTGGT GGCTATTAGA ACTTTCTCAA CGTTTGGATA 1500 

TAGACATTCT GACGGAAATG GAAAACTTCC TCTCTGATAA AGAAAAGCAA TTGAACGTTA 1560 

GGACTTGGAA CTAGTCTGCT GATAAAAAAT CAATGCTTAG AAACTATGAA ATAATAAAAA 1620 

AGGAGAACAT CATGATTAAC ATTACTTTCC CAGATGGCGC TGTTCGTGAA TTCGAATCTC 1680 

GCGTAACAAC TTTTGAAATT GCCCAATCTA TCAGCAATTC CCTAGCTAAA AAAGCCTTGG 1740 

CTGGTAAATT CAACGGCAAA CTCATCGACA CTACTCGCGC TATCACTGAA GATGGAAGCA 1800 

TCGAAATTGT GACACCTGAT CACGAAGATG CCCTTCCAAT CTTGCGTCAC TCAGCAGCTC 1860 

ACTTGTTCGC CCAAGCAGCT CGTCGTCTTT TCCCAGACAT TCACTTGGGA GTTGGTCCAG 1920 

CCATCGAAGA TGGTTTCTAC TACGATACTG ACAACACAGC TGGTCAAATC TCTAACGAAG 1980 

ACCTTCCTCG TATCGAAGAA GAAATGCAAA AAATCGTCAA AGAAAACTTC CCATCTATTC 2040 

GTGAAGAAGT GACTAAAGAC GAGGCACGTG AAATCTTCAA AAATGACCCT TACAAGTTGG 2100 

AATTGATTGA AGAACACTCA GAAGACGAAG GCGGTTTGAC TATCTATCGT CAGGGTGAAT 2160 

ATGTAGACCT CTGCCGTGGA CCTCACGTTC CATCAACAGG TCGTATCCAA ATCTTCCACC 2220 

TTCTCCATGT AGCTGGTGCG TACTGGCGTG GAAACAGCGA CAACGCTATG ATGCAACGTA 2280 

TCTACGGTAC AGCTTGGTTT GACAAGAAAG ACTTGAAAAA CTACCTTCAA ATGCGTGAAG 2340 

AAGCTAAGGA ACCTGACCAC CGTAAACTTG GTAAAGAGCT TGACCTCTTT ATGATTTCAC 2400 

AAGAAGTGGG ACAAGGTTTG CCATTCTGGT TGCCAAATGG TGCGACTATC CGTCGTGAAT 2460 

TGGAACGCTA CATCGTAAAC AAAGAGTTGG TTTCTGGCTA CCAACACCTC TACACTCCAC 2520 

CACTTGCTTC TGTTGAGCTT TACAAGACTT CTGGTCACTG GGATCATTAC CAAGAAGACA 2580 

TGTTCCCAAC CATGGACATG GGTGACGGGG AAGAATTTGT CCTTCGTCCA ATGAACTGTC 2640 

CGCACCACAT CCAAGTTTTC AAACACCATG TTCACTCTTA CCGTGAATTG CCAATCCGTA 2700 

TCGCTGAAAT CGGTATGATG CACCGTTACG AAAAATCTGG TGCCCTCACT GGCCTTCAAC 2760 

GTGTACGTGA AATGTCACTC AACGACGGTC ACCTATTCGT TACTCCACAA CAAATCCAAG 2820 

AAGAATTCCA ACGTGCCCTT CAGTTGATTA TCGATGTTTA TGAAGACTTC AACTTGACTG 2880 

ACTACCGCTT CCGCCTCTCT CTTCGTGACC CTCAAGATAC TCATAAGTAC TTTGATAACG 2940 

ATGAGATGTG GGAAAATGCC CAAACCATGC TTCGTGCAGC TCTTGATGAA ATGGGCGTGG 3000 

ACTACTTTGA AGCCGAAGGT GAAGCAGCCT TCTACGGACC AAAATTGGAT ATCCAGATTA 3060 

AAACTGCCCT TGGAAAAGAA GAAACCCTTT CTACTATCCA ACTTGATTTC TTGTTGCCAG 3120 

AACGCTTCGA CCTCAAATAC ATCGGAGCTG ATGGCGAAGA TCACCGTCCA GTCATGATCC 3180 

ACCGTGGGGT TATCTCAACT ATGGAACGCT TCACAGCTAT CTTGATTGAG AACTACAAGG 3240 

GGGCCTTCCC AACATGGCTG GCACCACACC AAGTAACCCT CATCCCAGTA TCTAACGAAA 3300 

AACACGTGGA CTACGCTTGG GAAGTGGCCA AGAAACTCCG TGACCGCGGT GTCCGTGCAG 3360 

ACGTAGATGA GCGCAATGAA AAAATGCAGT TCAAGATCCG TGCTTCACAA ACCAGCAAGA 3420 

TTCCTTACCA ATTAATTGTT GGAGACAAAG AAATGGAAGA CGAAACAGTC AACGTTCGTC 3480 

GCTACGGCCA AAAAGAAACA CAAACTGTCT CAGTTGATAA TTTTGTTCAA GCTATCCTAG 3540 

CTGATATCGC CAACAAATCA CGCGTTGAGA AATAAGAGTC TAGCATAAAA GCCTCCAATC 3600 

TGGAGGCTTT TTCTCATCTA TTTTTACTCA AGGACTAAGT TCACTTGAGC AAACTGAATC 3660 

CGCACTGTCG TTCCTTTTCC GACCTCAGAC TCGATACGAA TCTGGTGCCC CAGTTCTTCA 3720 

GAAATTTTCT TAGATAGATA AAGGCCAAGT CCAGAGGACT CCTGGGTCAA ACGGCCATTG 3780 

TATCCTCAAA AGCCACGTTC AAATACTCGG AGGACATCAC TGTTTTTTAT CCCGATTCCC 3840 

GTATCTTTGA TACAAAGCTC TTGGTCATCC ATATAAATCT CCAGACCACC TTCCTTGGTG 3900 

TACTTGAGAC TGTTTGAGAT GATTTGCTCA ATAACCACTA GCAGCCACTT TTTATCCGTC 3960 

ACGATTTCTT TATCAAGGTC ATGTAGATTG ACATTTAAGC CTTTTTGAAT AAAGAAAAGA 4020 

GCATATTTAC GAATTATTTC CTTGACCAAG TCCTCAATTT GAACCTGCTT TAAGACCAAA 4080 

TCATCATGGA AACTTTCTAA ACGCAGGTAC TGTAAAACTA GGTTGGTATA GGAGTCGATT 4140 

TTGAAAATTT CCTGTTCTAG CTGCTGCTTC AGTTGGCGGT CGACCACTTC TGCAACTAAG 4200 

AGTTGACTGG CTGCAATGGG GGTCTTTATC TGATGGACCC ACAAGGTATA GTAATCCAGC 4260 

AAATCCGTCA GTTTTCTTTC TGCTTTTGAC CTCTGCTGAT AGAGTTCCAT CTCACGCGCT 4320 

TCTAATTTTT CTGCTAAAGC TATTTCCAAA GGAGACTTGG CTTCCCTCTC TCCATAGAGA 4380 

AGTTCCTGGC GATAGACCTG CGTTTCCACC AATATGTCCC AAGTGAAAAA TAATATGGTT 4440 

ACAAAGCAAC ACAAGAAGAA AAAGTAGAGG AAGTAAATTC CTAGACTGGC AAATAAAAAC 4500 

TGAAAGAGTA AGACAAGAAA TGCCAAAGAA AGCAGATAGA TAAAAAGACG ACTACGGGAG 4560 

CGCAGATAGG CTAGAAAAAA TTGTTTCCAA TCAAGCATGC TTCAATCCGT ACCCTATTCC 4620 

TTTCTTGGTC TCGATAAATC CTACCAATCC CTGCTCCTCC AACTTTTTAC GCAAACGAGC 4680 

CACATTGACA GAGAGGGTAT TATCATCAAT GAAAAAGTCA CTGTTCCAAA GTTCCCGCAT 4740 

CAGGTCGTCA CGTGCTACGA TGTTGCCTGC ATGCTCAAAT AACACGCGTA AAATCTGGAA 4800 

TTCATTCTTG GTCAAATTCA AGACTTGCCC TTGATAATGT AAATCCATGG ATTTGGTATT 4860 

GAGGATAACA CCAGCATATT CCAGCAAACT CTCATCACGC CCAAACTCAT AGGAACGACG 4920 

CAACAAGCCC TGAACCTTAG CTAAAAGAAC CTGCTGGTCA AAAGGCTTGG TCACAAAGTC 4980 

ATCCGCCCCC ATATTGATTG CCATGACAAT ATCCATAGCC TGGTCTCTCG AAGAAAGAAA 5040 

CATGATAGGT ACCTTGGAAA TCTTGCGGAT TTCCTGACAC CAGTGATAAC CATTAAACAA 5100 

GGGCAAACCA ATATCCATGA GGACCAGATG AGGTTCCGAC TGAACAAATA GACTCAAAAC 5160

TTCCATAAAG TCTTCTACCA GGACCACTTC AAATCCCCAT TCAGAGAGCA TTTTCCCAAT 5220

CTGTTGACGA ATGACCTGAT CATCTTCTAT TAATAAAATC TTGTGCATGC GCTTCTCCTT 5280

TTCCATTATT ATAACAGATT TTTCCATGCT AGATGGTCTG AAACTGAATT TGAAATAGCC 5340

TGTTTTTAGC CAGTACAAAC AGGCTATGCT ACTAGCTAAT TTGAGGGAAA TTTGCTAAGA 5400

TAAATAAAAA GAAAGGAGCT CTTATGGCCA ATATTTTTGA CTATCTGAAA GATGTCGCAT 5460

ATGATTCTTA TTACGACCTT CCCTTGAATG AGTTAGACAT TCTAACCTTA ATAGAAATCA 5520

CCTACCTCTC CTTTGATAAT CTGGTCTCCA CACTTCCTCA ACGTCTTTTA GATCTAGCAC 5580

CTCAGGTTCC AAGAGATCCC ACCATGCTTA CTAGCAAAAA TCGCCTTCAA TTATTAGATG 5640

AATTGGCTCA ACACAAGCGC TTCAAAAATT GCAAACTCTC CCATTTTATC AACGACATCG 5700

ACCCTGAACT GCAAAAGCAA TTTGCGGCTA TGACTTATCG TGTCAGCCTC GATACCTATC 5760

TGATTGTCTT TCGTGGGACA GATGACAGTA TCATTGGCTG GAAGGAAGAT TTCCACCTGA 5820

CCTATATGAA GGAAATTCCT GCTCAAAAGC ACGCCCTTCG CTATTTAAAG AACTTTTTTG 5880

CCCATCATCC TAAGCAAAAC GTTATTCTAG CTGGGCATTC CAAGGGAGGA AATCTCGCTA 5940

TCTATGCTGC TAGCCAAATT GAGCAAAGTT TGCAAAATCA GATCACAGCA CTTTATACAT 6000

TTGATGCACC TGGTCTCCAT CAAGAATTGA CACAGACTGC GCGTTATCAA AGGATAATGG 6060

ATAGAAGCAA GATATTCATT CCACAAGGTT CCATTATCGG TATGATGCTG GAAATTCCTG 6120

CTCACCAAAT CATCGTTCAG AGTACTGCCC TGGGTGGCAT CGCCCAGCAC GATACCTTTA 6180

GTTGGCAGAT TGAGGACAAG CACTTCGTCC AACTGGATAA GACCAACAGT GATAGCCAGC 6240

AAGTAGACAC AACCTTTAAA GAATGGGTGG CCACAGTCCC TGACCAAGAA CTTCAGCTCT 6300

ACTTCGACCT CTTCTTTGGC ACTATTCTTG ATGCTGCTAT TAGCTCTATC AATGACTTCG 6360

CTTCCTTAAA GGCGCTTGAA TACATTCATC ATCTCTTTGT CCAAGCTCAA TCCCTCACTC 6420

CAGAAGAAAG AGAAACCTTG GGTCGCCTTA CCCAGTTATT GATTGATACT CGTTACCAGC 6480

CATGGAAAAA TAGATAATAC TCTTGAAAAT TAAATGTATA CAAAACAAAA GACCTAGAAT 6540

ACATACTTTC ATGTGCATTC TAAGTCTTTT TAAATAGAAT CTAATAGTCA ATAAAAATCA 6600

AAGAGCATTG AGAGATAATG GGGCTTGGAA CGTCCCTCTC GCTTCAACAA AATGACCCCA 6660

TTATAGATTA AAAAGATGCC ACTTAGAAAA AGCAAAAAAG GAAGTAAGAC AAAGGCAAAT 6720

ATATAAAAAG CTAACTGAAC ATTCTCGTAT CCATTTTTAT AAAAAAGGTA GGATAGATAA 6780

AAATAACTTG AAATGAGGGA TAATAAAAAT AATACTGGAT TCCACAAACT TCTATTATCC 6840

TTCCAAAATG ACACTATAAA GGCTAATACA ATTCCTATAA CGAGATACAT TTCTTACTCC 6900

TTTAATAGCT ACA'i M rrrATc ATAATTATCC AAAGAAAAAA GAGGGCATTT ATCCCTCTTA 6960

ATCCTTCATC TGACTCTCTG CATCGGCCAC GACTTTTTCT AGACTGGTTT GACCAAGTTC 7020


TGCCTCCATA GTCAACTGAA TTCTCTCCAA TTTTTGATCC AAAACATCAT GAATATGAGC 7080

TCCTACAGGG CAATTTGGAT TCGGATTGTC ATGGAAACTG AAGAGTTGAC CTGTCTTACC 7140

AAGACATTCG ACCGCCTGAT AAACATCTAA AAGACTAATA TCCTTAAGGT CCTTGACAAT 7200

CTCTGTTCCG CCCGTTCCAC GCGCTACTGA AATCAGCTCT GCCTTCTTCA ACTGGGACAA 7260

GATCTTTCTG ATAATGACAG GATTGACCCC GACACTAGCA GCCAGAAAAT CACTGGTCAC 7320

CTTGCTTTCC TTCCCCTCGA GGGCAATGAT TATCAGCATA TGAGTCGCAA TGGTAAATCT 7380

ACTTGGAATT TGCATCCTCT TCTCCTTTTT ACGAGGCTAC CCTGCCTCTA CTCTTCTTTT 7440

TCTATTATTA TACCCTTTTT AGTTGTAATG TCAATCGTTA CCACTTTTCA ACCAGTCGTC 7500

TAACTCCCGA TCGCAGCCCT CTTTCTGAGC CAATTCTCTC AAAAATTCCT GATGATGAGT 7560

ATGGTGGATC CCATTGACCA GACTTTCATA GTAAACCTCA AAATAGGGAA GTCTCAGGTC 7620

TTTAGCCAGC TGCAATTCAG CTGCTACATC GTAGTCTACC CGTCGGAAGT CCATATCTAC 7680

CAGGCCTTTG TCATCAAACT CCAAAATCAT ATACTGGGCC CGCAAGTCCT TCCGTAGCTG 7740

AGCGTCCAAA AAGAAAGGTT GGCCAATCGA ACCCGGATTG ACAATCAATT GCCCACCAGT 7800

CCCGTAACGA AGCAACTGCT GGTGAATATG TCCATAAACA GCAATATCAC AGGGAGGATG 7860

AGTCACCAAG CGGTCAAACT CCTCTTGTTT GCCAGTATGA ATCAACTCTC GCCCCCAGTT 7920

CTTATCAGGC AGATGATGGC TAATTCCCAC CGTCAAArCC CCAAACTGAC GATGAATTTC 7980

AAGAGGTTGA TTGTGGAGCA CTTCAATTTC TTCTAGGGAA ATTTCCTCTA AAACATACTG 8040

GCACTGGCGC AAGAGATAGC GTTGACTGGG GCGAGTACTG TCCAATTCCT TACGGACACC 8100

ATGCCAAAGA CTGTCTTCCC AGTTTCCCAA AACTCTAGCC GTAATCGGTA GTTGATCCAA 8160

CAAGTCCAAA ATCCTTCTAC GCCCTGTCCC TGGCATGAGA ATATCTCCCA AAAGCCAGTA 8220

TTCATCCACT CCTATCTGCC GAGCATCTGC CAAAACAGCC TCCAAGGCGG TGGTATTTCC 8280

ATGAATATCT GAAAGAAGAG CTATTTTCGT CATATCCATC TCCTCGTTTT TTCTCTTGCA 8340

ATAAGTATAA CATAAAAAGT CACAGCTAGA GAAATCTAGC TTTTTTTGAT ATACTAGATA 8400

AAGATATTAG ACAAGAGGAA ACGAATGACC CCAAACAAAG AAGACTATCT AAAATGTATT 8460

TATGAAATTG GCATAGACCT GCATAAGATT ACCAACAAGG AAATTGCGGC TCGCATGCAA 8520

GTCTCTCCCC CTGCCGTAAC TGAAATGATC AAACGAATGA AAAGTGAAAA TCTCATCCTA 8580

AAGGACAAGG AATGTGGCTA TCTACTGACT GACCTCGGTC TCAAACTGGT CTCTGAGCTC 8640

TATCGTAAGC ACCGCTTGAT TGAAGTTTTT CTAGTTCATC ATTTAGACTA TACAAGTGAC 8700

CAGATTCACG AGGAAGCTGA GGTCTTGGAA CACACTGTCT CTGACCTGTT CGTGGAAAGA 8760

CTAGATAAAC TGCTAGGTTT CCCTAAAACC TGCCCCCACG GGGGAACTAT TCCTGCCAAG 8820 

GGAGAACTAC TCGTTGAAAT CAATAACCTC CCACTAGCTG ATATCAAGGA AGCTGGCGCC 8880 

TACCGCCTGA CTCGGGTGCA CGATAGTTTT GACATTCTCC ATTATCTGGA CAAGCACTCA 8940 

CTTCACATCG GTGACCAGCT CCAAGTCAAG CAGTTTGATG GCTTCAGCAA TACCTTCACT 9000 

ATCCTCAGTA ACGACGAGGA TTTACAAGTG AATATGGACA TTGCAAAACA ACTCTATGTC 9060 

GAGAAAATCA ACTAATTTCT CAAGTCCCCT ACCAACCCTG AAAGTTTTAT TTTGGCTCTT 9120 

TGTCAACTGT AGTGGGTTGA AGTCAGCTAA GCTCGAGAAA GGACAAATTT TGTCCTTTCT 9180 

TTTTTGATAT TCAGAGCGAT AAAAATCCGT TTTTTGAAGT TTTCAAAGTT CCGAAAACCA 9240 

AAGGCATTGC GCTTGATAAG TTTGATGAGA TTATTGGTCG CTTCCAGTTT GGCATTAGAA 9300

TAGTGTAGTT GAAGGGCCTT GACAATCTTT TCTTTATCTT TGAGGAAGGT TTTAAAGACA 9360 

GTCTGAAAAA TAGGATGAAC CTGCTTTAGA TTGTCCTCAA TGACTCCGAA AAATTTCTCC 9420 

GGTTTCTTAT TCTGAAAGTG AAACAGCAAG AGTTGATAGA GCTGATAGTG GTGTTTCAAG 9480 

TCTTGTGAAT AGCTCAAAAG CTTGTCTAAA ATCTCTTTAT TGGTTAAGTG CATACGAAAA 9540 

GTAGGACGAT AAAATCGCTT ATCACTCAGT TTACGGCTAT CCTGTTGTAT GAGCTTCCAG 9600

TAGCGCTTGA TAGCCTTGTA TTCATGGGAT TTTCGATCCA ATTGGTTCAT AATTTGAACA 9660 

CGCACACGAC TCATAGCACG GCTAAGATGT TCTACAATGT GAAAGCGATC CAACACGATT 9720 

TTAGCATTCG GGAGTGAAAC AGTCTGGGAG ACTGTTTCAG CCTGAGCCTA GAAATTTGAA 9780 

AGCGAAGCTG TTTAGCCAAG TCATAGTAAG GACTAAACAT ATCCATCGTA ATGATTTTCA 9840

CTTGACAACG AACGGCTCTA TCGTAGCGAA GAAAGTGATT TCGGATGACA GCTTGTGTTC 9900 

TGCCTTCAAG AACAGTGATA ATATTAAGAT TATCAAAATC TTGCGCAATG AAACTCATCT 9960

TTCCCTTAGT GAAGGCATAC TCATCCCAAG ACATAATCT? TGGAAGCCGA GAAAAATCAT 10020 

GCTCAAAGTG AAAGTCATTG AGCTTGCGAA TGACAGTTGA AGTTGAAATG CCCAGCTGAT 10080 

GGGCAATATC AGTCATAGAA ATTTTTTCAA TTAACTTTTG AGCAATyTTT TGGTTGATGA 10140 

TACGAGGGAT TTGGTGATTT TTCTTTACCA GGGGAGTCTC AGCAACCATC ATTTTTGAAC 10200 

AGTGATAGCA CTTGAAACGA CGCTTTCTAA GGAGAATTCT AGAAGGCATA CCAGTCGTTT 10260 

CAAGATAAGG AATTTTAGAA GGTTTTTGAA AGTCATATTT CTTCAATTGG TTTCCGCACT 10320 

CAGGGCAAGA TGGGGCGTCG TAGTCCAGTT TGGCGATGAT TTCCTTGTGT GTATCCTTAT 10380 

TGATGATGTC TAAAATCTGG ATATTAGGGT CTTTAATGTC TAGTAATTTT GTGATAAAAT 10440

GTAATTGTTC CATATGATTC TTTCTAATGA GTTGTTTTGT CGCTTTTCAT TATAGGTCAT 10500

ATGGGACTTT TTTTCTACAA TAAAATAGGC TCCATAATAT CTATAGTGGA TTTACCCACT 10560 

ACAAATATTA TAGAACCGTA AAAATAGAAG GAGATAGCAG GTTTTCAAGC CTGCTATCTT 10620 

TTTTTGATGA CATTCAGGCT GATACGAAAT CATAAGAGGT CTGAAACTAC TTTCAGAGTA 10680 

GTCTGTTCTA TAAAATATAG TAGATTGAAA TAAGATGTGA ACAACTCTAT CAGGAAAGTC 10740 

AAATTAATTT ATAGAATTAT TTTAGCAGTC AAGGTGTACT GTTATAGATT CAATATATTA 10800 

TATGACTATT AACCTTGTCT TCTCCTAAAA TTGACTTTCT TGTTTTCTTA TCTTGTCCAC 10860 

TCGAAACAAG TATTGTAAGA ATTTGATTAT TTTTGAAAGT ACTTTTAATA TACTTGATAT 10920 

AGTTAAAAAA GATTTGAAAC TAAATTCCAA ATTAGAAAAA GACTTGAAAT ACTAAAAAAA 10980 

AAAAAGTATA CTCTAATTGA AAACGGTAAC AAAACTAATT TAGAGAATGA AATATAGAGT 11040 

ATTTCTCTCT TAAAAGTTTT TGGTGAAACG AGATGTAGAA AGGAGATTTA GCCAAAGAGT 11100 

CTATTAGTGC TAGAATAATA GATTAGAATT ATTTTAGAAA AACGAAGTGA GCAGCTTATA 11160 

AATTCAAGTC CCCAAATAGA TTCATACTAG TATCTTTTGC AAAAAATAAA GGGCGACTTC 11220

CTTCATGAAT ATCAATTTCA TCTATAAGGA AGGTAGCTAA TTGAACTAAC TTATTTATTC 11280 

TCTTTGTCGC TAGAAAAATC AGACCTCCTT GTGAAGATTG AGGAGATACT TAATGAAAAT 11340 

CAAAGAAGAA ACTAGCAAGC TAGTAGCAGA TTGCCCAAAA CACCGCTTTG AGGTTGTAGA 11400 

TAAGACTGAC CTATATAATC CAAGGTGAAG CGACTGTGGT TTGAAGAGAT TTTCAAAGAG 11460 

TATAGGCTAG AGAGTAGTGT TTTTATGTCC TTCTAGTACA AAATGCTAGA CAGAAGAATG 11520 

GGGAACTTGG ATAGGAAAAA TAGATTGAGA AAGGAGGTTA GAAGAGATGA TTATTACAAA 11580 

AATTAGCCGT TTAGGAACTT ATGTGGGAGT AAATCCACAT TTTGCAACAT TAATAGATTT 11640 

TCTAGAAAAA ACAGGACTAG AAAATTTAAC AGAAGGTTCG ATTGCTATCG ATGGTAATCG 11700 

ATTGTTTGGG AATTGCTTTA CTTATCTAGC AGATGGTCAA GCAGGGGCTT TCTTTGAAAC 11760

CCACCAAAAA TATTTGGATA TTCATTTAGT TTTGGAAAAC GAAGAAGCCA TGGCTGTTAC 11820 

ATCGCCGCAA AATGTAAGCG TTACCCAAGA ATATGATGAA GAGAAAGATA TTGAATTATA 11880 

CACAGGGAAA GTGGAACAGT TGGTTCATTT GAGAGCTGGC GAATGCCTCA TCACTTTTCC 11940 

AGAAGATTTA CATCAACCCA AGGTTCCTAT AAATGATGAA CCTGTGAAAA AAGTTGTCTT 12000 

TAAAGTTGCG ATTTCTTAAT GTAGAAAGAG AAGAACGATG AAAAAAATGA GAAAGTTTTT 12060 

ATGTCTAGCT GGAATTGCGC TAGCGGCTGT TGCCTTGGTA GCTTGTTCAG GAAAAAAAGA 12120 

AGCTACAACT AGTACTGAAC CACCAACAGA ATTATCTGGT GAGATTACAA TGTGGCACTC 12180 

CTTTACTCAA GGACCCCGTT TAGAAAGTAT TCAAAAATCA GCAGATGCTT TCATGCAAAA 12240

GCATCCAAAA ACGAAAATCA AGATTGAAAC ATTTTCTTGG AATGACTTCT ATACTAAATG 12300

GACTACAGGT TTAGCAAATG GAAATCTGCC AGATATCAGT ACAGCTCTTC CTAACCAAGT 12360 

AATGGAAATG GTCAACTCAG ATGCTTTGGT TCCGCTAAAT GATTCTATCA AGCGTATTGG 12420 

ACAAGATAAA TTTAACGAAA CTGCCTTAAA TGAAGCAAAA ATCGGAGATG ATTACTACTC 12480

TGTTCCTCTT TATTCACATG CACAAGTCAT GTGGGTTAGA ACAGATTTCT TAAAAGAACA 12540 

TAATATTGAG GTTCCTAAAA CTTGGGATCA ACTCTATGAA GCTTCTAAAA AATTGAAAGA 12600

AGCTGGAGTT TATGGCTTGT CTGTTCCGTT TGGAACAAAT GACTTAATGG CAACACGTTT 12660 

CTTGAACTTC TACGTACGTA rTGGTGGAGG AAGCCTCTTA ACAAAAGATC TTAAAGCAGA 12720 

CTTGACAAGC CAACTTGCTC AAGATGGTAT TAAATACTGG GTTAAATTGT ATAAAGAAAT 12780 

CTCACCTCAA GATTCTTTGA ACTTTAATGT CCTTCAACAA GCTACCTTGT TCTATCAAGG 12840 

AAAAACAGCA TTTGACTTTA ACTCTGGCTT CCATATCGGA GGAATTAATG CCAACAGTCC 12900 

TCAATTGATT GATTCGATTG ATGCTTATCC TATTCCAAAA ATCAAAGAGT CTGATAAAGA 12960 

CCAAGGAATT GAAACCTCAA ACATTCCAAT GGTTGTTTGG AAAAATTCAA AACATCCAGA 13020 

AGTTGCTAAA GCATTCTTAG AAGCACTTTA TAATGAAGAA GACTACGTTA AATTCCTTGA 13080 

TTCAACTCCA GTAGGTATGT TGCCAACTAT TAAGGGGATT AGCGATTCTG CAGCCTATAA 13140 

AGAAAATGAA ACTCGTAAGA AATTTAAACA TGCTGAAGAA GTAATTACTG AAGCTGTTAA 13200 

AAAAGGTACT GCTATTGGTT ATGAAAATGG GCCAAGTGTA CAAGCTGGTA TGTTGACTAA 13260 

CCAACACATT ATTCAACAAA TGTTCCAAGA TATCATTACA AATGGAACAG ATCCTATGAA 13320

AGCAGCAAAA GAAGCAGAAA AACAATTAAA TGATTTATTT GAGGCTGTTC AGTAGATGTA 13380

AAAGACTAGA AAATAGGTGG GATAGTGAGC TGAAAAGCTC TAGCCCAATC TTGTAAAAGA 13440 

AGGGAGAAGG AGAATGGTTA AAGAACGTAA TTTAACTCGC TGGATATTTG TTTTGCCAGC 13500 

TATGATTATC GTAGCATTAC TCTTTGTTTA TCCGTTTTTC TCGAGTATTT TTTATAGCTT 13560

TACCAATAAG CATTTGATTA TGCCTAATTA TAAATTTGTT GGTTTGGCTA ACTATAAAGC 13620 

TGTGCTATCA GATCCCAACT TCTTTAATGC GTTCTTTAAT TCAATTAAGT GGACCGTTTT 13680

CTCATTAGTT GGTCAAGTTT TAGTAGGGTT TGTATTGGCT TTAGCTCTTC ACAGAGTACG 13740 

CCACTTCAAG AAATTATATA GGACATTATT GATTGTTCCT TGGGCATTTC CTACCATCGT 13800 

TATTGCCTTC TCTTGGCAGT GGATTCTAAA CGGGGTTTAT GGCTACTTAC CTAATCTAAT 13860 

CGTAAAATTA GGTTTAATGG AACATACACC TGCATTTTTG ACAGATAGTA CATGGGCATT 13920 

CCTATGTTTG GTGTTTATCA ACATTTGGTT TGGAGCACCA ATGATTATGG TTAATGTGCT 13980

TTCAGCTTTG CAAACAGTAC CAGAAGAACA ATTTGAGCCT GCTAAGATAG ATGGTGCTTC 14040

AAGTTGGCAG GTGTTCAAGT TTATCGTCTT TCCACATATT AAAGTGGTTG TAGGACTTCT 14100

AGTTGTTTTG AGAACTGTAT GGATCTTTAA TAACTTTGAC ATTATCTACC TCATTACTGG 14160

TGGTGGACCA GCCAATGCTA CAACGACGCT TCCAATTTTT GCTTACAACC TGGGCTGGGG 14220

AACTAAATTG TTGGGTCGTG CTTCAGCAGT TACAGTACTG CTCTTTATCT TCTTGGTGGC 14280

GATTTGCTTT ATCTACTTTG CTATCATCAG TAAGTGGGAA AAGGAGGGTA GAAAATAATG 14340

AAGAAGAAAT CCAGTATTTA TTTAGATATT CTCTCACATG TACTTTTAGT TGGTGCGACC 14400

ATCGTTGCAG TTTTCCCATT GGTATGGATT ATCATATCTT CTGTCAAAGG GAAAGGGGAA 14460

TTAACTCAGT ATCCAACACG ATTTTGGCCT GAACAGTTTA CATTAGATTA TTTCACTCAT 14520

GTTATCAACG ATTTGCACTT CATTGATAAC ATTCGAAACA GTTTAATCAT TGCCTTGGCT 14580

ACAACCCTTA TTGCGATTAT TATTTCTGCT ATGGCAGCCT ATCGTATTGT TCGATTCTTT 14640

CCTAAATTGG GAGCAATCAT GTCGAGACTA CTCGTCATTA CCTACATTTT CCCACCAATT 14700

TTGTTAGCAA TTCCCTATTC AATTGCCATT GCTAAAGTTG GGTTAACAAA TAGTTTATTT 14760

GGCTTGATGA TGGTTTATCT ATCTTTTAGT GTTCCATATG CAGTTTGGCT CTTAGTTGGA 14820

TTTTTCCAAA CAGTTCCAAT TGGAATTGAA GAAGCGGCTA GAATTGATGG TGCAAATAAA 14880

TTTGTTACGT TTTATAAAGT TGTGCTACCG ATTGTAGCAC CAGGTATTGT AGCAACAGCT 14940

ATTTATACAT TTATCAATGC TTGGAATGAA TTCCTGTATG CCTTGATTTT GATTAACAAT 15000

ACAGGAAAGA TGACAGTAGC AGTAGCCCTT CGTTCACTTA ATGGTTCAGA AATACTAGAC 15060

TGGGGAGATA TGATGGCAGC GTCTGTTATT GTAGTTCTTC CATCAATTAT TTTCTTCTCT 15120

ATCATCCAAA ATAAGATTGC AAGTGGATTA TCAGAAGGAT CTGTGAAGTA GACGAAAGAA 15180

GGAAAAAAAT GAATAAAAGA GGTCTTTATT CAAAACTAGG AATTTCCGTT GTAGGCATTA 15240

GTCTTTTAAT GGGAGTCCCC ACTTTGATTC ATGCGAATGA ATTAAACTAT GGTCAACTGT 15300

CCATATCTCC TATTTTTCAA GGAGGTTCAT ATCAACTGAA CAATAAGAGT ATAGATATLA 15360

GCTCTTTGTT ATTAGATAAA TTGTCTGGAG AGAGTCAGAC AGTAGTAATG AAATTTAAAG 15420

CAGATAAACC AAACTCTCTT CAAGCTTTGT TTGGCCTATC TAATAGTAAA GCAGGCTTTA 15480

AAAATAATTA CTTTTCAATT TTCATGAGAG ATTCTGGTGA GATAGGTGTA GAAATAAGAG 15540

ACGCCCAAAA GGGAATAAAT TATTTATTTT CCAGACCAGC TTCATTATGG GGAAAACATA 15600

AAGGACAGGC AGTTGAAAAT ACACTAGTAT TTGTATCTGA TTCTAAAGAT AAAACATACA 15660

CAATGTATGT TAATGGAATA GAAGTGTTCT CTGAAACAGT TGATACATTT TTGCCAATTT 15720

CAAATATAAA TGGTATAGAT AAGGCAACAC TAGGAGCTGT TAATCGTGAA GGTAAGGAAC 15780

ATTACCTCGC AAAAGGAAGT ATTGATGAAA TCAGTCTATT TAACAAAGCA ATTAGTGATC 15840

AGGAAGTTTC AACTATTCCC TTGTCAAATC CATTTCAGTT AATTTTCCAA TCAGGAGATT 15900


CTACTCAAGC TAACTATTTT AGAATACCGA CACTATATAC ATTAAGTAGT GGAAGAGTTC 15960

TATCAAGTAT TGATGCACGT TATGGTGGGA CTCATGATTC TAAAAGTAAG ATTAATATTG 16020

CCACTTCTTA TAGTGATGAT AATGGGAAAA CGTGGAGTGA GCCAATTTTT GCTATGAAGT 16080

TTAATGACTA TGAGGAGCAG TTAGTTTACT GGCCACGAGA TAATAAATTA AACAATAGTC 16140

AAATTAGTGG AAGTGCTTCA TTCATAGATT CATCCATTGT TGAAGATAAA AAATCTGGGA 16200

AAACGATATT ACTAGCTGAT GTTATGCCTG CGGGTATTGG AAATAATAAT GCAAATAAAG 16260

CCGACTCAGG TTTTAAAGAA ATAAATGGTC ATTATTATTT AAAACTAAAG AAGAATGGAG 16320

ATAACGATTT CCGTTATACA GTTAGAGAAA ATGGTGTCGT TTATAATGAA ACAACTAATA 16380

AACCTACAAA TTATACTATA AATGATAAGT ATGAAGTTTT GGAGGGAGGA AAGTCTTTAA 16440

CAGTCGAACA ATATTCGGTT GATTTTGATA CTGGCTCTTT AAGAGAAAGG CATAATGGAA 16500

AACAGGTTCC TATGAATGTT TTCTACAAAG ATTCGTTATT TAAAGTGACT CCTACTAATT 16560

ATATAGCAAT GACAACTAGT CAGAATAGAG GAGAGAGTTG GGAACAATTT AAGTTGTTCC 16620

CTCCGTTCTT AGGAGAAAAA CATAATGGAA CTTACTTATG TCCCCGACAA GGTTTAGCAT 16680

TAAAATCAAG TAACAGATTG ATTTTTGCAA CATATACTAG TGGAGAACTA ACCTATCTCA 16740

TTTCTGATGA TAGTGGTCAA ACATGGAAGA AATCCTCACC TTCAATTCCG TTTAAAAATG 16800

CAACAGCAGA AGCACAAATG GTTGAACTGA GAGATGGTGT GATTAGAACA TTCTTTAGAA 16860

CCACTACAGG TAAGATAGCT TATATGACTA GTAGAGATTC TGGAGAAACA TGGTCGAAAG 16920

TTTCGTATAT TGATGGAATC CAACAAACTT CATATGGCAC ACAAGTATCT GCAATTAAAT 16980

ACTCTCAATT AATTGATGGA AAAGAAGCAG TCATTTTGAC TACACCAAAT TCTAGAAGTG 17040

GCCGCAAGGG AGGCCAATTA GTTGTCGGTT TACTCAATAA AGAAGATGAT AGTATTGATT 17100

GGAAATACCA CTATGATATT GATTTGCCTT CGTATGGTTA TGCCTATTCT GCGATTACAG 17160

AATTGCCAAA TCATCACATA GGTGTACTGT TTGAAAAATA TGATTCGTGG TCGAGAAATG 17220

AATTGCATTT AAGCAATGTA GTTCAGTATA TAGATTTGGA AATTAATGAT TTAACAAAAT 17280

AAAGGAGAAA AACATGGTTA AATACGGTGT TGTTGGAACA GGGTATTTTG GAGCTGAATT 17340

GGCTCGCTAC ATGCAAAAGA ATGATGGAGC AGAGATTACT CTTCTCTATG ATCCAGATAA 17400

TGCAGAGGCG ATTGCAGAAG AATTGGGAGC AAAAGTAGCA AGTTCCTTAG ATGAGTTGGT 17460

TTCTAGCGAT GAAGTAGATT GTGTTATCGT CGCAACTCCA AATAATCTTC ATAAGGAACC 17520

GGTTATTAAG GCTGCACAGC ATGGTAAAAA TGTTTTCTGT GAWtftACCAA TTGCGCTTTC 17580

TTATCAAGAT TGTCGCGAGA TGGTAGATGC GTGTAAAGAA AACAATGTAA CCTTTATGGC 17640


AGGACATATT ATGAATTTCT TTAATGGTGT TCATCATGCA AAAGAACTCA TTAATCAAGG 17700

AGTTATCGGA GACGTTCTAT ATTGTCATAC AGCTCGTAAT GGTTGGGAAG AACAACAACC 17760

GTCAGTATCA TGGAAAAAAA TTCGTGAAAA ATCAGGTGGT CACTTGTATC ACCACATCCA 17820

TGAATTGGAT TGCGTTCAAT TCCTTATGGG GGGCATGCCT GAAACTGTAA CCATGACAGG 17880

TGGAAATGTG GCCCATGAAG GTGAACATTT CGGTGATGAA GATGATATGA TTTTTGTCAA 17940

TATGGAATTT TCTAATAAGC GTTTTGCCTT GTTAGAATGG GGTTCAGCTT ATCGTTGGGG 18000

TGAACATTAT GTCTTAATCC AAGGAAGCAA AGGTGCCATC CGCTTAGACT TATTCAACTG 18060

TAAAGGAACT CTTAAGCTAG ATGGGCAAGA AAGCTATTTC TTGATTCACG AATCGCAAGA 18120

AHAAGATCAT GATCGGACTC GTATCTATCA TAGTACAGAG ATGGATGGAG CAATTCCTTA 18180

TGGTAAACCA GGTAAACGTA CTCCATTATG GCTATCATCT GTCATTGATA AAGAAATGCG 18240

CTATCTGCAT GAGATTATGG AAGGAGCTCC AGTATCAGAA GAATTTGCAA AACTTTTGAC 18300




           GCCCTAGAAG CAATTGCTAC TGCAGATGCT TGTACCCAGT CTATGTTTGA 18360

AfiATr i r;f*AAA GTAAAATTGT CAGAAATTGT AAAATAAATT TTGGTATTCT CCTATTTATA 18420

           CTCCTCTGAA AGTACTTTTA GAGGAGXZTGT TTGACTTTGC TAGTTTTTGA 18480


AACTGAAATC TATTATACTA CAAACTATTG AAAGCGTTTT AATTTTAAGG TATAATAATC 18540

TPATAGAAAT AAAGAAAAGG AGGAAAGAGG ATGCCACAGA TTAGCAAAGA AGCCTTGATT 18600

GAGCAAATCA AAGATGGAAT CATCGTTTCT TGTCAGGCTC TTCCTCATGA ACCGCTTTAT 18660

ACAGAAGCGG GAGGGGTGAT TCCCTTGCTG GTCAAAGCGG CTGAGCAAGG TGGAGCAGTC 18720

GGTATCCGAG CAAACAGTGT TCGCGATATC AAGGAAATTA AGGAAGTCAC TAAACTTCCA 18780

ATCATTGGGA TTATCAAACG TGATTATCCA CCTCAGGAAC CCTTCATCAC GGCTACTATG 18840

AAAGAAGTTG ATGAATTGGC AGAACTGGAC ATCXJAGGTGA TTGCTCTGGA TTGTACCAAG 18900

CGTGAACGCT ACGATGGTTT GGAAATTCAA GAGTTCATTC GTCAGGTTAA GGAGAAATAT 18960

CCTAATCAGC TTTTGATGGC TGATACTAGT ATCTTCGAAG AAGGCCTAGC AGCTGTAGAA 19020

GCAGGAATTG ACTTTGTCGG AACAACCTTA TCAGGCTACA CATCCTACAG TCCAAAAGTA 19080

GACGGTCCAG ATTTTGAATT GATTAAGAAA CTCTGTGATG CTGGTGTAGA TGTCATTGCA 19140

GAAGGAAAAA TTCATACACC AGAACAAGCC AAACAAATCC TTGAATATGG AGTGCGAGGC 19200

ATCGTTGTTG GTGGCGCCAT TACTAGACCA AAAGAGATTA CAGAACGCTT CGTTGCTAGT 19260

CTTAAATAAG ATGTGAGGGG GAGTTTTATG TTTAAAGTTT TACAAAAAGT TGGAAAAGCT 19320

TTTATGTTAC CTATAGCTAT ACTTCCTGCA GCAGGTCTAC TTTTGGGGAT TGGTGGTGCA 19380

CTTTCAAACC CAACCACGAT AGCAACTTAT CCAATACTAG ACAATAGTAT TTTTCAATCA 19440 

ATATTCCAAG TAATGAGCTC TGCAGGAGAG GTTGTATTCA GTAATTTGTC ACTACTTCTC 19500 

TGTGTGGGAT TATGTATTGG CTTAGCGAAA CGAGATAAAG GAACCGCTGC GTTAGCAGGA 19560 

GTAACTGGTT ACTTAGTTAT GACTGCAACG ATCAAAGCTT TGGTAAAACT TTTTATGGCA 19620 

GAAGGATCTG CAATTGATAC TGGAGTTATT GGAGCATTAG TTGTCCGAAT AGTTGCCGTA 19680 

TATTTGCACA ACCGATATAA CAATATTCAA TTACCTTCCG CTTTAGGATT CTTTGGAGGT 19740 

TCACGCTTCG TTCCTATTGT TACATCGTTC TCTTCTATCT TGATTGGCTT TGTCTTCTTT 19800 

GTTATTTGGC CACCTTTCCA ACAACTTCTT GTTTCTACAG GTGGATATAT TTCTCAGGCG 19860

GGTCCAATTG GAACTTTTCT ATATGGATTT TTAATGAGAC TTTCTGGAGC AGTAGGCTTA 19920 

CATCATATAA TTTACCCTAT GTTTTGGTAT ACTGAACTTG GTGGTGTTGA AACTGTTGCA 19980

GGACAAACAG TGGTTGGAGC TCAAAAAATA TTTTTTGCTC AATTAGCCGA TTTCGCCCAT 20040

TCTGGATTAT TTACAGAAGG AACAAGGTTT TTTGCAGGTC GTTTCTCAAC AATGATGTTC 20100 

GGTTTACCGG CTGCCTGTTT AGCGATGTAC CATAGTGTTC CTAAAAATCG TCGTAAAAAA 20160 

TACGCGGGTT TGTTTTTTGG AGTTGCTTTA ACATCTTTTA TTACCGGTAT TACAGAACCA 20220 

ATTGAATTTA TGTTTCTATT CGTCAGTCCG GTTCTATATG TTGTTCACGC ATTCCTTGAT 20280 

GGTGTTAGCT TCTTTATTGC AGACGTCTTA AATATTTCAA TAGGAAACAC ATTTTCAGGA 20340 

GGTGTAATCG ATTTCACTTT ATTTGGAATT TTGCAGGGGA ACGCTAAGAC CAATTGGGTT 20400 

CTTCAGATTC CATTTGGACT TATTTGGAGT GTTTTGTATT ATATTATTTT TAGATGGTTC 20460 

ATTACTCAAT TCAACGTTCT AACGCCAGGG CGAGGAGAAG AAGTAGATTC TAAAGAAATT 20520 

TCTGAATCCG CAGATTCAAC TTCAAATACT GCAGATTATT TAAAACAGGA TAGCCTACAA 20580 

ATTATCAGAG CCTTGGGTGG ATCAAATAAT ATAGAAGATG TAGATGCTTG TGTGACACGT 20640 

TTACGTGTAG CTGTAAAAGA AGTTAATCAA GTTGATAAAG CACTTTTAAA ACAAATTGGT 20700 

GCAGTTGATG TCTTAGAAGT GAAGGGTGGC ATTCAAGCAA TCTATGGAGC AAAAGCAATC 20760 

TTATATAAAA ATAGTATTAA TGAAATTTTA GGTGTAGATG ATTAAGTACT TACTGACTTA 20820 

ATAAAAAACA GAGGAGAGTG ATGGATGAGT AGGATGAAAT GAAATCGCAT ACAAGAAATA 20880 

AAGAACTCAT TATCCAAGTT GGATACGCTT ATTACATAGG AGAATACAAA TGAAATTTAG 20940 

AAAATTAGCT TGTACAGTAC TTGCGGGTGC TGCGGTTCTT GGTCTTGCTG CTTGTGGCAA 21000 

TTCTGGCGGA AGTAAAGATG CTGCCAAATC AGGTGGTGAC GGTGCCAAAA CAGAAATCAC 21060

TTGGTGGGCA TTCCCAGTAT TTACCCAAGA AAAAACTGGT GACGGTGTTG GAACTTATGA 21120

AAAATCAATC ATCGAAGCGT TTGAAAAAGC AAACCCAGAT ATAAAAGTGA AATTGGAAAC 21180 

CATCGACTTC AAGTCAGGTC CTGAAAAAAT CACAACAGCC ATCGAAGCAG GAACAGCTCC 21240 

AGACGTACTC TTTGATGCAC CAGGACGTAT CATCCAATAC GGTAAAAACG GTAAATTGGC 21300 

TGAGTTGAAT GACCTCTTCA CAGATGAATT TGTTAAAGAT GTCAACAATG AAAACATCGT 21360

ACAAGCAAGT AAAGCTGGAG ACAAGGCTTA TATGTATCCG ATTAGTTCTG CCCCATTCTA 21420 

CATGGCAATG AACAAGAAAA TGTTAGAAGA TGCTGGAGTA GCAAACCTTG TAAAAGAAGG 21480 

TTGGACAACT GATGATTTTG AAAAAGTATT GAAAGCACTT AAAGACAAGG GTTACACACC 21540 

AGGTTCATTG TTCAGTTCTG GTCAAGGGGG AGACCAAGGA ACACGTGCCT TTATCTCTAA 21600 

CCTTTATAGC GGTTCTGTAA CAGATGAAAA AGTTAGCAAA TATACAACTG ATGATCCTAA 21660 

ATTCGTCAAA GGTCTTGAAA AAGCAACTAG CTGGATTAAA GACAATTTGA TCAATAATGG 21720 

TTCACAATTT GACGGTGGGG CAGATATCCA AAACTTTGCC AACGGTCAAA CATCTTACAC 21780 

AATCCTTTGG GCACCAGCTC AAAATGGTAT CCAAGCTAAA CTTTTAGAAG CAAGTAAGGT 21840 

AGAAGTGGTA GAAGTACCAT TCCCATCAGA CGAAGGTAAG CCAGCTCTTG AGTACCTTGT 21900 

AAACGGGTTT GCAGTATTCA ACAATAAAGA CGACAAGAAA GTCGCTGCAT CTAAGAAATT 21960 

CATCCAGTTT ATCGCAGATG ACAAGGAGTG GGGACCTAAA GACGTAGTTC GTACAGGTGC 22020 

TTTCCCAGTC CGTACTTCAT TTGGAAAACT TTATGAAGAC AAACGCATCG AAACAATCAG 22080 

CGGCTGGACT CAATACTACT CACCATACTA CAACACTATT GATGGATTTG CTGAAATGAC 22140 

AACACTTTGG TTCCCAATGT TGCAATCTGT ATCAAATGGT GACGAAAAAC CAGCAGATGC 22200 

TTTGAAAGCC TTCACTGAAA AAGCGAACGA AACAATCAAA AAAGCTATGA AACAATAGTC 22260 

CTTAGTTATT CTATAAAAAG TAGTTTTTTA AAGAACCTAA GAGTGTATAC CCCCTTTTCC 22320 

CTCTACACAG ATAGTGTAAG AAAAGGGGGC TTTTGTTTAA AATGTAAGAA ACTGTCACGA 22380 

AATTAAAATG AAGTTCTTAC ATAAGCGAAT CATAAAAAAT TTCATTTTGA TTTTAAAACA 22440 

GTTCAAGAAA GTCAAAAAAT TATTCTATTT GAAAGAGAGG TGCCGACTCT GAAAGTCAAT 22500 

AAAATCCGTA TGCGGGAAAC AGTGATTTCC TACGCTTTCC TAGCACCAGT ATTATTCTTC 22560 

TTTGTCATCT TTGTGTTGGC TCCGATGGTG ATGGGCTTCA TTACAAGTTT CTTTAACTAC 22620 

TCAATCACTA AATTTGAGTT TGTAGGCTTG GATAACTATA TCCGTATGTT TAAAGATCCT 22680 

GTCTTTACAA AATCTCTGAT TAACACAGTT ATTTTGGTTA TTGGATCTGT ACCAGTTGTT 22740 

GTTCTATTCT CACTCTTTGT AGCATCTCAG ACCTATCATC AAAATGTCAT TGCCAGATCC 22800 

TTCTACCGTT TCGTCTTCTT CCTTCCTGTT GTAACGGGTA GTGTTGCCGT GACAGTTGTT 22860

TGGAAATGGA TTTATGACCC ACTATCAGGG ATTCTAAACT TTGTCCTTAA GTCCAGCCAC 22920

ATCATCAGCC AAAACATTTC TTGGTTGGGA GATAAAAACT GGGCATTGAT GGCGATTATG 22980

ATTATTCTCT TGACCACTTC AGTTGGTCAG CCCATCATCC TTTATATCGC TGCCATGGGC 23040

AATATTGACA ATTCACTGGT TGAAGCGGCG CGTGTTGATG GTGCAACTGA GTTTCAAGTT 23100

TTTTGGAAGA TTAAATGGCC AAGCCTTCTT CCAACAACTC TTTATATTGC AATCATCACA 23160

ACAATTAACT CATTCCAGTG TTTCGCCTTG ATTCAGCTTT TGACATCTGG TGGTCCAAAC 23220

TACTCAACAA GTACCTTGAT GTACTACCTT TACGAAAAAG CCTTCCAATT GACAGAATAC 23280

GGCTATGCCA ACACAATTGG TGTCTTCTTG GCAGTCATGA TTGCTATCGT AAGCTTTGTT 23340

CAATTTAAAG TACTTGGAAA CGACGTAGAA TACTAAAGAA AGGAGACAGC TATGCAATCT 23400

ACAGAAAAAA AACCATTAAC AGCCTTTACT GTTATTTCAA CAATCATTTT GCTCTTGTTG 23460

ACTGTGCTGT TCATCTTTCC ATTCTACTCG ATTTTGACAG GGGCATTCAA ATCACAACCT 23520

GATACAATTG TTATTCCTCC TCAGTGGTTC CCTAAAATGC CAACCATGGA AAACTTCCAA 23580

CAACTCATGG TGCAGAACCC TGCCTTGCAA TGGATGTGGA ACTCAGTATT TATCTCATTG 23640

GTAACCATGT TCTTAGTTTG TGCAACCTCA TCTCTAGCAG CTTATGTATT GGCTAAAAAA 23700

CGTTTCTATG GTCAACGCAT TCTATTTGCT ATCTTTATCG CTGCTATGGC GCTTCCAAAA 23760

CAAGTTGTCC TTGTACCATT GGTACGTATC GTCAACTTCA TGGGAATCCA TGATACTCTC 23820

TGGGCAGTTA TCTTGCCTTT GATTGGATGG CCATTCGGTG TCTTCCTCAT GAAACAGTTC 23880

AGTGAAAATA TCCCTACAGA GTTGCTTGAA TCAGCTAAAA TCGACGGTTG TGCTGAGATT 23940

CGTACCTTCT GGAGTGTAGC CTTCCCGATT GTGAAACCAG GGTTTGCAGC CCTTGCAATC 24000

TTTACCTTCA TCAATACTTG GAATGACTAC TTCATGCAAT TGGTAATGTT GACTTCACGT 24060

AACAATTTGA CCATCTCACT TGGGGTTGCG ACCATGCAGG CTGAAATGGC AACCAACTAT 24120

GGTTTGATTA TGGCAGGAGC TGCCCTTGCT GCTGTTCCAA TCGTCACAGT CTTCCTAGTC 24180

TTCCAAAAAT CCTTCACACA GGGTATTACT ATGGGAGCGG TCAAAGGATA ATACTCTGCG 24240

AAAATCTCTT CAAACTACGT CAGCTTCACC TTGCCATACT TAAGTATTGC CTGCGGTTAG 24300

CTTCCTAGTT TGTTCTTCAA TTTTCATTGA GTATAGGAAA ATCAATCTAT CAAGATACAG 24360

AAGTATATTT TATAGATTTA GAGAATATAG AGGTTATAAG TGTCTACAAA ATGGAGGGTA 24420

TGCAGTTACT TTATGAAGTT TTGTCAGACA CTTATAAACT TAAGAATGGT TTTAGTTAAC 24480

TATCAGAAAC GAAGGAAAGA GTATGATTTT TGACGATTTG AAAAACATCA CCTTTTACAA 24540

AGGGATTCAT CCTAATTTAG ACAAGGCTAT CGACTATCTC TACCAACATC GTAAGGATTC 24600

TTTCGAATTA GGAAAGTATG ATATTGATGG AGATAAAGTC TTTCTAGTTG TTCAGGAAAA 24660

TGTCCTCAAT CAAGCTGAAA ATGATCAATT TGAGTATCAT AAGAACTATG CAGATTTGCA 24720


TTTGCTGGTA GAAGGACATG AATATTCGAG CTACGGTTCA CGTATCAAAG ACGAGGCAGT 24780

AGCATTCGAC GAAGCGAGTG ACATTGGCTT TGTTCATTGT CATGAACACT ACCCACTCTT 24840

GTTGGGTTAT CACAATTTTG CGATTTTCTT CCCAGGTGAG CCACATCAGC CAAATGGTTA 24900

TGCAGGCATG GAAGAAAAGG TTCGAAAATA TCTCTTTAAA ATTTTGATTG ATTAAAAATA 24960

GGATGAATTG TTTTTTTGTA AAGCTTTGAT AATACTCTAC CATGAAATTG ATCTTTGTGA 25020

GGTAGAGAAA TGAGAATAAA ATATTTAAAA ATTGGTATCT TCTAAGTATC CTGCAAGAGC 25080

TAGTTTCTTA GATGGACAGG GGATTACAGT TGATGAGATG GCTTGGATAA TTAGGGGCAT 25140

TGTGAATGCA TTGATTGGTA GATACATAAA ATTAGGTACT TATGCGGCTA AGTATGGTAT 25200

TAGTATGGCA CGCTCGATCT TAAGTAGGGT AGCTGCAACT GCAGCAGCAA GAGTAGGATT 25260

ACTGACCAAG ATTTCTGGAT GGATTTTACG AGTAGCTGTG AATGTAGCTG ATGTATATGG 25320

TAATTTTGCC AACAATATTG CTGCAGCTTG GGATGCATAT GATAAAATTC CTAACAATGG 25380

TCGTATAAAC TTTTAAAATG CGAGAATGAA ACCACTTTGT ATTTTTTTAT TGAATATGTT 25440

AGCTTGGACA GTGCTTGCAA TGATAATTCG TGGAGGGCTA GATGGATTTG ATAGGCATAC 25500

TTGGAGTACT ATTTTAATTG CGTCGCTGTT CGGGGTATAT GATTATAAGC CCATAGATAA 25560

AAATAGAAAA AAGTCCAAAA GAAAAAATAG ATTTGTTCAT GGTAGGGACT TATGAAAGCT 25620

TTACTGACAA AAAAGAAAAC AGTTTACAAA GAAAAATGAT GGAGGAGCAA ACATGGCACA 25680

AAAAGGAGTA AGCCTTATCA AGGCAGCATT TGATACAGAT AACTTTCTCA TGCGTTTTAG 25740

TGAGAAGGTC TTGGACATCG TGACAGCCAA TCTTCTTTTT GTCGTCTCTT GTTTACCCAT 25800

CGTGACGATT GGAGTGGCTA AAATCAGCCT CTACGAGACC ATGTTCGAAG TTAAGAAGAG 25860

CAGACGGGTG CCTGTTTTTA AAATCTATCT AAGATCTTTC AAGCAAAATC TGAAACTAGG 25920

TCTTCAGCTG GGTTTAATGG AGTTAGGAAT TGTGTTTCTT ACCCTTTCAtj All- 1 A. 1A1L1 25980

TTTCTGGGGT CAAACAGCTC TGCCCTTCCA ATTGCTGAAA GCCATTTGTT TAGGTATTCT 26040

GATTTTTCTT ACTATCGTGA TGCTGGCTAG TTACCCTATC GCGGCACGTT ATGACCTATC 26100

TTGGAAAGAA ATTCTTCAAA AAGGATTGAT GTTGGCTAGT TTTAACTTTC CTTGGTTCTT 26160

CCTCATGTTA GCCATTCTTG TCCTCATTGT GATGGTTCTT TATCTGTCCG CCTTCAGTCT 26220

ACTCTTAGGT GGCTCAGTCT TCCTACTTTT TGGGTTTGGA CTATTGGTCT TTATCCAGAC 26280

TGGATTGATG GAGAAAATTT TCGCAAAATA CCAATAGGAG CTTTATTTCT GAAACTACTT 26340

TCAAAGGCTC CAAACGCTAT TCTATAAGCG AGAAACTAAA ATCGG 26385

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CCTGCCCGCA TTGCCCTAGG CATTAAGTAA ACATATAAAA GCATGTGAGA GACTGTTGGA 60 

AAAGCGAGGA AATTTCCCCT CTTTTCCTCT AGTCTCTCCT TTCTTTTGCT GATTTTATTC 120 

AAAGAAAATG ATATAATAGT AGTTATGGAG AAAAAGAAAT TACGCATCAA TATGTTGAGT 180 

TCAAGTGAGA AAGTAGCAGG ACAGGGAGTT TCAGGTGCTT ACCGTGAATT AGTTCGTCTT 240 

CTTCACCGTG CTGCCAAGGA CCAATTGATT GTTACAGAAA ATCTTCCAAT CGAGGCAGAT 300 

GTGACTCACT TTCATACGAT TGATTTTCCC TATTATTTAT CAACCTTCCA AAAGAAACGC 360 

TCAGGGAGAA AGATTGGCTA TGTGCATTTC TTGCCAGCTA CACTTGAGGG AAGTTTGAAA 420 

ATTCCATTTT TCTTAAAGGG AATTGTGAAA CGCTATGTAT TTTCTTTTTA CAACCGGATG 480

GAGCACTTGG TTGTGGTCAA TCCTATGTTT ATTGAGGATT TCGTAGCAGC TGGTATTCCA 540 

CGTGAAAAAG TGACCTATAT TCCTAACTTT GTCAACAAGG AAAAATGGCA TCCTCTACCA 600 

CAAGAAGAGG TAGTCAGACT GCGCACAGAT CTTGGTCTTA GTGACAATCA GTTTATCGTA 660 

GTAGGTGCTG GGCAAGTTCA GAAACGTAAA GGGATTCATG ACTTTATCCG TCTGGCTGAG 720 

CAATTGCCTC AGATTACCTT TATCTGGGCT GGTGGCTTCT CTTTTGGTGC TATGACAGAT 780 

GCTTATGAAC ACTATAAGAA AATTATGGAA AATCCCCCTA AAAATTTGAT TTTTCCAGGC 840 

ATTGTATCGC CAGAGCGGAT GCGCGAATTG TATCCTCTAG CGGATCTTTT CTTGTTGCCT 900 

AGTTACAATG AGCTCTTTCC TATGACTATT TTAGAACCTC CGAGTTGTGA GGCTCCTATT 960 

ATGTTGCGTG ATTTAGATCT CTATAAGGTG ATTTTGGAGG GAAATTATCG GGCGACAGCG 1020 

GGTAGAGAAG AGATGAAAGA GGCTATTTTG GAATATCAAG CAAATCCTGC TGTCTTAAAA 1080 

GATCTCAAAG AAAAGGCTAA GAATATTTCC AGAGAGTATT CTGAAGAGCA TCTGTTACAA 1140 

ATCTGGTTGG ACTTTTATGA GAAACAAGCC GCTTTAGGGA GAAAGTAAAA AGTGAGGTAA 1200 

TCTATGCGAA TTGGTTTATT TACAGATACC TATTTTCCTC AGGTTTCTGG TGTTGCGACC 1260 

AGTATTCGAA CCTTGAAAAC AGAACTTGAA AAGCAGGGAC ATGCTGTTTT TATCTTTACG 1320 

ACGACAGATA AGGATGTCAA TCGCTACGAA GATTGGCAAA TTATCCGCAT TCCAAGTGTT 1380 



WO 98/18931 



PCT/US97/19588 



174 

CCTTTCTTTG CTTTTAAGGA TCGTCGCTTT GCCTACCCAG GTTTTAGCAA GGCACTTGAA 1440

ATTGCTAAAC AGTATCAGCT AGATATTATC CATACTCAGA CAGAATTTTC TCTTGGCCTG 1500

TTGGGGATTT GGATTGCGCG TGAATTGAAA ATTCCAGTCA TCCATACCTA TCACACCCAG 1560

TATGAAGACT ATGTCCATTA TATTGCTAAG GGGATGTTGA TCCGGCCGAG TATGGTCAAG 1620

TATCTGGTTA GAGGTTTCCT GCATGATGTG GATGGGGTTA TTTGCCCTAG TGAGATTGTC 1680

CGTGACTTGC TATCTGATTA TAAGGTCAAG GTTGAAAAAC GGGTCATTCC TACTGGGATT 1740

GAATTAGCCA AGTTTGAGCG TCCGGAAATC AAGCAGGAAA ATTTGAAAGA ACTGCGTAGT 1800

AAACTAGGGA TTCAAGATGG TGAAAAGACG TTGCTTAGTC TTTCGAGAAT CTCCTATGAA 1860

AAAAATATTC AAGCAGTTTT AGCAGCCTTT GCTGATGTTC TGAAAGAGGA AGACAAGCTT 1920

AAACTGGTAG TAGCTGGGGA TGGCCCTTAT CTGAATGACC TCAAAGAGCA AGCCCAGAAC 1980

CTAGAGATTC AAGACTCACT CATCTTTACA GGGATGATTG CTCCTAGTGA GACGGCTCTT 2040

TACTATAAAG CGGCGGATTT CTTCATTTCG GCATCGACAA GCGAAACGCA AGGTTTGACC 2100

TACTTGGAAA GCTTAGCCAG TGGAACACCT GTCATTGCTC ACGGAAATCC TTATTTGAAC 2160

AACCTCATCA GTGATAAAAT GTTTGGAACC TTGTACTATG GAGAACATGA TTTGGCTGGT 2220

GCTATTTTGG AAGCCCTGAT TGCAACACCA GACATGAACG AGCATACCTT ATCAGAGAAA 2280

TTGTATGAGA TTTCAGCTGA GAACTTTGGG AAACGAGTCC ATGAGTTTTA TCTGGATGCC 2340

ATTATTTCAA ATAACTTCCA GAAAGATTTG GCTAAACATG ATACGGTCAG TCAGCGTATC 2400

TTTAAGACAG TTTTGTATCT TCAGCAACAG GTGGTTGCTG TACCTGTAAA AGGATCTAGA 2460

CGCATGTTGA AGGCTTCAAA AACACAGTTG ATCAGTATGA GAGACTATTG GAAAGACCAT 2520

GAAGAATAGA AAGAGGAACA GCTATGAAAA AAACAATTAA TGAGAAGCCG TCGTGATAAA 2580

AAGATTGCGG GTGTTTGTGC TGGGGTGGCC CATTATCTGG ATATGGATCC GACTATCGTT 2640

CAAGTCATTT GGGGTGTTCT TACTTGCTGT TACGGAGCTC CAATTGTAGC TTACATTATT 2700

TTATGGATTA TCGCGA 2716

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13926 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTTTCGTTTT GCCTTATTCA AGACATGAGG GCCATCAGGA ATGATCTGAA ACTGCGAATC 60

TGTTAACAGT CTATGGAGAG CTTTCATAGA ACTAAGATTC GGTTTATCTT TGCTGCCACA 120

AATTAGTAAG GTTGGATAAG GGTAAGTTCC TGCTATATCC GTTAAATCAA GTGTCTTCAA 180

CTCCTCAGAA ACTCCGACCA TAAGAGTCTT GTCTGCTCCC TGTTTTTCAA ATACTCTTTT 240

GGGAAGTAGT TTAAAAATCA GCAATTGAAG ATAAAATAGG ATATTCCCTG CTAATTTAAG 300

CGGGCATCCT GACAGAATCA AAGCTCGAAG ATTTGGTAAA TCGTAACTGG AAAGTTCTAG 360

TGTCAGGGCA GCACCTAAGG ACAATCCAAT CAAAACAAAA GGTTCTGTCT CTTGAGCTAG 420

GTGCTGATAA ACTCGCTCTT TAGCTTGTTG ATAGTTACTA ACTCCAGAAG GAAATAACTC 480

GATAGCCTCA GAAGGATAAT CTGTCAGTAG ATTCCGAACT TCTTTCCAAG ACTCTGCTGA 540

CTGCCCTAAC CCATGCAAAA ATATTAATTT CATCTAGTTC TCCTCAAGGC TTAATTCATA 600

CAAGCCTCTC ACTGCATTAC AGCCGTAAAT AGCTTCTGCT TGGGTTAAAT CTGCCAAGGT 660

CAAGACTTTC TCTTCTACCT GTCCTGTTTC TAGCAAATGC TGACGGTAAA TTCCTGGCAA 720

GATTCCAAGT CGGATAGGCG GTGTGTAGAG TTTTCCAGCG ATTTTCAGAA CCAAATTTCC 780

TATAGAGGTT TCAAGCAGTT CTCCTGACTT ATTGTGGTAA ATCTTCTCTT GTTCTCCTAG 840

GCTCAAATGC GGTCGGTGAG TGGTTTTAAA GTAGGTAAAG GATTGATTCA AAGCAGCTTC 900

CTGAAGACAG ACTTGGGCCT GACAAAAGCT TGTACTGAGA GGGGTTAATA CTTGACGATT 960

GACTTCTATC TCTCCAGATT TGCTAAGGCT GATTCGCAAG CGGTAATCTC GATTAGCTTC 1020

ACAATCCTGA CACTCTTCCT CAATCTTGTG TCCCAAGTCT TCTGCATCAA AAGGAAAAGC 1080

AAAATAACGA CTAGCTTTTC TCAGCCTTTC CAGATGTTGT TCTTCAAACA TCAGTTGTTT 1140

TTGGCTGATT TTTCCAGTTG TAATTAATTG GAAGCGAGCT TGTTTACGAT AGAGAACTGC 1200

TGCCTTTTGA TGAACCTCTC GGTATTCAGA TTCCCATGTG CTATCCCAAG TAATCCCTCC 1260

GCCAACTCCA TAAATGGCTT GACCTTTGTG AAGTTGAATG GTACGAATGG CCACATTAAA 1320

AATCCGTCGT CCATTTGGAA GCAAGAGACC AATCGTTCCA CAGTAGACTC CACGCGGTTG 1380

AGGCTCCAAG TCCTTGATAA TCTCCATTGT CGCAATTTTC GGTGCACCCG TTATGGAACC 1440

ACAAGGAAAG AGTGAGCGGA AGATTTCAAC AAGGTCCACA TCCTCTCGCA ACTGACTCTT 1500

GATGGTCGAA GTCATCTGCC AAACAGTTGA ATACTGCTCT ACCTGACACA GACGCTCCAC 1560

GTGCTCGCTC CCAACTTCAG AAATACGGTT CATATCATTG CGCAAGAGGT CCACAATCAT 1620

CATATTTTCA GAGCGATTTT TGGGATCCTG TTCCAACCAA CTGGCCTGTT CAAGATCTTC 1680

TTGGTCAGTT ACCCCACGCT GAGTCGTCCC CTTCATTGGT CGTGTTGTCA ACTCGCGATC 1740

ATTTTGCTCA AAAAAGAGCT CTGGGCTCAT GGAAATCACT GTCATCTCGT CATGTTCCAC 1800

ATAGGCATTG TAGCCCGCCT CCTGCTCTAC CACCATACGA TTGTAGATGG CAAAAGGATT 1860

GGCATTTAAC TTTTGCTTAA CTTGGACGGT GTAGTTGACC TGATAGGTAT CTCCCTGCCG 1920 

TAAATGATGG TGAATTTGGG CAATGGCCTT TTCATAGTCT GCTGCAGACG TTACTTCCTG 1980 

CCAATTTGAG GGCAAATCAA TATCCTCATA AGTCAGAGGA ATAGGGGAAG TTTCTACGAT 2040 

ATCATGAACA GTAAAGTAAA GCAGGTACTC TCCCAGTAGG GGATCCTTGT GAACTGCTAA 2100 

TTTTTCCTCA AAAGCAGGTG CAGCCTCGTA GCTGACATAC CCCACCACAT AATAACCTTG 2160 

CTCTTGGTAG CTTTCCACTT GTGCCAGCAA ATCTGCCACT TCTTCTACAT TTCTCGTTTT 2220 

CAACTCTTTA ATAGGCTGGG TAAAGGTATA TCTCTCCCCC AAAGTCCTAA AATCAATCAC 2280 

TGTTTTTCTA TGCATACCTT AAGTATAGCA TAAAATAAGA AAACCCTCAT CCGCAAAGCA 2340 

GATGAGAGAT TTCAATTATT TAAAGATTGA AGTTTTAAAG CTATTTGTTT GTTGAAGAAG 2400 

TTTCTTATAA ACAGCTTCTT TTAATTTAAC TGTATTATTC ATAGATACTG TTTTATTACC 2460 

GTTTGCTTCT TGTTTAAGAG TTTCGGCATC TTTTTTAACA GCTTCTTTAA ACAATGTCAG 2520 

TAAATCATCG TATGATGAAA CGGAAGAACC ATTTACTTCG AATGTTGTTA ATCCTTTCGT 2580 

TGCTTTATCT TTAACTTCTT TGAAGTAAGC TTTTTTAAAT TCTTCAATAG TATTAAATGT 2640

ATTGTTAGAT ATTTTCTTGA TAATATATTC ATCACTTAGA ACAGACTCAC CATCTGTTTT 2700 

AGATTGTTGT TTATATTTAT TTGAAGCATA ACCTAAGAAC CCATTTTCGT ATCCGTAGTA 2760 

ACCCCATAAT CTAAAAGCAT TATGTTTGAA TGAAACAGCT CCAGCAGCAC CTTTACTAGT 2820 

ATTACCTCCG TAGATACCGG TCATCATTCT AACACCTACA TAAGGTGATT GATCGTTATA 2880 

GCTAATTGCT TCGGGTTTAT AGATACCATT ACCTGGATTG CGATTAGTCA TTAATTGTTG 2940 

ATCAACTAAA TCATTAACAG ATTGAATATT TAATTCATTT TTCTCTTCTT GACTTAGATT 3000 

TCGAATTTTA TCCCATTGAT TTAATTTATT GTTATCACGG TATTCTCTAT CTATTTTTTT 3060 

GAACCATGCA CTATTTAAAT CTTTATTTTG TTGAGAAATC ACAGATTCAG CCTCAATTTC 3120 

ATCAAGAAGA GTTAAAGTGT CATTATAACC CTTCATATAT CTATTAATAT CTTCTCGTGT 3180 

TTTTAGAGTT TTTGGATCTG TAATATACCA CTGATTCCCA TCATTTTTGC GTTTAAATAC 3240 

CATATTAATA CCTAAAGAAC CAAACTCATC AAATCCACTA CCAGTAACAG GAGTTTGTAG 3300 

CATACCCTGA CCATATGCTT CAGCATCAGT ACCTTCACGG TGTCCAAAGC CACCTAAGTA 3360 

AATCGCACGG TCGTTGACGT GTGTTGTTTC ATGTGTGTAA ACTGAAATAC CGTATTCACC 3420 

AACCATTTCT AAATGAACAT ATTTTACATC AGTTCTAATA TCATCAGAGT TAGGATATAT 3480 

AGCAGCATAA GCTCCTGTTC CATTATAATT ATAATACTTA TCCATAGGAC CAAAGAATTC 3540 

TCTAAGAGGA GTATATACTT TGTCGGTATT ATAGCGGCCA TATTTTTCAA CCCATCCACC 3600 
AGCAGCGTTA TAACCTTCCC AAATAGGAAT AACAGCATCT CTTACTAGTC GTTGTTTAAC 3660

GTTATCAGAC GCTAGACGAT ACCAGAAATC ATAATAGTTT CTATAACCAT CTGCAGCTTT 3720

GTTAACGATA TCTTTAATAT CTTCTAATGA TTTTTTACCT AATCGCTCTG CACTACCAAA 3780

GGCAATTGCA TTATAATTTG AAATTAAATA AAGATGTGCT TTATCAATAT TCAGTAGTGG 3840

GAGTATAGTA TTTCTAAGGT GACTTCGTTT TAAATTATCG AATGCACGAT GTTTAGAATT 3900

TTTAATTTCT TCGACCTCAG AAGCGCGTTC TGCGATGTAG ACATGGTCTT CTGTAGCATC 3960

AATAAACCAA TCGTTCATAT TGTCTATATT TGTGAACAAT TGTCTATTAT AATTTAAAAA 4020

TGCATCTAAA TTACCTGATT TAGTATATTT AGCCAATACT TGACCGAATG CGTCGAATGT 4080

ACGTGAACCT TTAATGTTGT TCTCTTTAGA ACCGATTTCA ATTAATCTGT CTAATACGCT 4140

AACTTTTTCA CCATAGAAAT CTGGTTTGAA TAGCATTAAT TCTTTAATAT TAACATCACC 4200

AAATTTAACT CCATAGTAAC GATTTAGGTA AGTTAAACCT AGTAATAAAG CTGCTTTGTT 4260

TTTCTCGACT TTATCACGAA TCATTTGACG AGCAGCTGCA GAATCATTTA GTTGATGTTC 4320

TTCGTTTTGA ACTAATTTTG TGATTAGGTT TGTTAAGTTT TCTTTAACAT CTGTGAAGCT 4380

TTCTTCTAAA TATAAATCTT TGATTGCATT AACTCTATAG TCACCTAATC GATTTAGATG 4440

CTGATACATC GTTTGAGACT GAAGCTCTAC TGATTCTAAA ATAGATTTTA TATCATTAAC 4500

AAGAGTAGTG TTATCTTTTT GAACGATATT AGGTGTATAT TTAATTCCTA AGTCAGTTAT 4560

AGTATATTCT TTTACATTAC TTAAACCTTC ACTGCTACAA GACAAGTTAA AGTAATCTTT 4620

TGTACCCTCC GCATAGTGAA CAATAATTTT ATTAGCTTCA TCTACGTTTG TGATAAACTC 4680

ATTGTTGTTC ATCGCGGTAA CAGAAAGAAC TTCTTTAGTA TTTAGATGGT GTTCTTTATT 4740

TAATTTATTA CCTTGATATA CAATATAATC TTTATTGTAG AATGGTATTA ATTTTTCAAG 4800

ATTTTTATAG GCTTGGTTAT ATTCAGCGTT ATAATCTTGA ATACTAGAAT AGGCTTTTTC 4860

TTCATTAAGT TTTGCAAGAG GAGATAGATC ACTTTCTAAT TTATCAGCAG TAATATTGAA 4920

AGTAGTAACT TTAGCATCAG CTTGTTCTTT AGTTAATTTA GTAAATGTTT TAGATTTCCT 4980

AAATGATCTA TTACCTGACG AATATCCCTC TACCGCATAT AAATCTTTTA TATGAGCACT 5040

AGCATAATCA GAATCATCAA CGTCGTTAGA GCCGAATAAC TCCTCTCCAC GGATAATCTT 5100

AGCATAGCTG ACAGAATTAC TTACCGTACC TACAGGCCAA GTCTTACTTG CTATTGCTCC 5160

AACTTCTACT GGATTTGAAA CATCTATTTT ACCTTTTACA ACCGACTCAG TTAGGAGAGC 5220

TTTTGTACCA ATAAGATGGT CTAGAGTTAA TCCATAATCT ACTTTAGGAA CTAACAAGCT 5280

GGCGCGTGTT TTGTTTCCTG TAATAGTAGC ATCAACATAT GCTTTTCTAA CAATTCCTCT 5340

ATAGTTTGTA CCTGCAATTC CCCCTGTATG AGAGCCATTT CCACTTGTAG AGTGTAGTTT 5400

GCCAAAGAAA GCAACATTTT CAATACGAGT TCCATCATTC ATATTATTTA CAAATCCAGC 5460

AACATTATTA CGACCTGAAA GTGTGCCTGT AATTTTGACA TTTGTAATAA CTGAAGAACC 5520 

TTTCATAGTA TTGGCTAATG ATGCAATATT ATCTTGACCA GAACGTTCTA TCTCTACATT 5580 

TTCAAAATTC ACATTATTTA TCGTTGCGTT TGTTATCACA TTAAATAATG GATGTTCCAA 5640 

TTCAGTAATA GCAAATTGTT TTCCTTCAGA ACTTAAAAGT TTTCCTGTGA ATTCTTTAGT 5700 

GATATATGAT TTTCCATTAG GAACAACATT TCTAGCGCTC ATTGATTGTC CCAGACGATA 5760

TTCTTTTGAA GGATCGTTTT GAATAGCTTC CACTAATTCT TTGAAATTAT AATATACATT 5820 

ATCTTCGTGG ACTTTAGGTT TTTCAATATA CTGAACGTAT TCTTCTTCAA ATTTATTATC 5880 

AGCAGTTCTA GAGACTAAAT TGTCTGCGAT TGCTGTAACT TTATATACAG GTGTTCCGTT 5940 

AACCGTAGTT TCTTCTATAT TTTTAACAGC TAGTAATGTA GTTTTCTGAT TATTTGAAGT 6000 

TATTTTTAAA TAATAATTGC TCTTATCATC AGGAATAGTT GTTATCAGTG ATTCATTAGT 6060 

TTCTTTTCCA TTTTCGTATT TGATTAAATC TGTACGTTTA ATATTTTTAA GCTCAACTTT 6120 

TTTAAGATCT AATTGAATAT TTTGATTTTC TAGAGTTTCA GTTTCTTCAC CGTTACCTCT 6180 

GTCGTAAATC ATAGTTGTAG ATAGGGTGTA TTCTTTGTAG TACTCTAGGT TCTTAAATGC 6240 

AGCGCTTATA GTTTCTGTTG TTACCTTGTC ATCTGTAAGG ACTACAGTAT TAATAACTTC 6300

TTCTCCTTTT TTCAATTCAG CTGTGATTGA TTTGATTTTT GTTTTGTTTT GATTTTCTAG 6360 

AGTATACTTA GCAACAGCTT CACGTTCCAA TATTTTCTTA TCGGTACTAG TCAATGTTAA 6420 

TATTGGCTTT TCAGATAATT CAACCAATTT TTCAATAGTT GCAGTTAATT TTTCAACAGC 6480

TTCGTTAACT TCACTTTGTT TAGCATCTGT ATTAGCTCCA ACTTTTTCAG CCTTTGTAAC 6540 

TTCAGTTTGG AGGTTTTGCC AACTTCTATC ACTGTAATGT TCTTTTACCT TTGTTTTTGC 6600 

ATCTGCAATC GTATTGTTTA ATTCAGTTTT ATCAACGTTT AGAGCGTCAA TAGCCGTTTT 6660 

AAGTTTATTT GTCTCCCTAT TTACCTCAGG CTGTTTTACA GGCTCTGAAG CATAGACACC 6720 

TTTTGCAGTT TCTAAAACAG GTCCAAGAGC ATTGTAACTT GCTGTAGAAT AATCAGTAGG 6780 

AGAAACTGAA CTAGCTTTAT CAATTTGATT ATTTAACTCA CTTTTATCAA CTGGTTCTTT 6840 

AGTACCAATA CCCTTTATTT TATCTTCTGG TTTCGGTGTT TCCTCTACAG CCTTCTCTTC 6900 

TTCAGGAACT TCTGGTTGCT TTTCTGGCTC AACTGGTGCC GTTGGTGCCT GTTCGTCTTC 6960 

TCTTGGCGCG ACTGGTTCAC CTGCTTGTTC AACTTTTGGT TCCTCTGTTG GTTCTGTTTG 7020 

TTTTTCTACA GCAGGCGTTT CAACTTTTGG TTGTTCAATA GATTGATTAA CAGTCTCCTC 7080 

TTTTGGTTCT ACAGTTTCTT CAGCCTTGGT ATCTGGACTT GACTCTTCTT GTTTCGGTGT 7140 
TTCCTCTACA GCCTTCTCTT CTTCAGGAGC TTCTGGTTGC TTTTCTGGCT CGACTGGTCC 7200

CTTTTCGTCT TCTCTTGGCG CGACTGGTTC ACCTGCTTGT TCAACTTTTG ATTCCTCAGC 7260 

TGGTTTGTCT GATGGTTGAC TTTCTGGCTT AACTGCTACT TTTTCCTCTG GTTTTGACTC 7320

AACTTCTCCA CCTACTTCTT CAACTGGAGC TGGTTCTGCT GAATCTTCTT TCCCCTCTTC 7380

TACTTTAGGA AGGGTGTCGT CAGTAGGTTT TACCTCCGAT TTTGGTTCTT CCTTTGGACT 7440 

TTCTTCTGTT TTAGGTGCTT CTTCTTTTGG AGCTTCCTCT GTCTCTACTA CTTGGTTTTC 7500 

TGTCCTAGCT TGCTCCTGAT TTGTTATTGA TTGAGGAGTC TCAACTTCGA CCACAGTCAC 7560

CTCTCCAGGT TTTGCTGAGG TTTCTTCTAA AACAGTGTCC AAGCCAAGCG TTTTGAGGAT 7620

GTCACCTGAT AGATAACCAA CATAGCGATA GCCCTCCATT TCAACAACAC CCTCTCGACT 7680

AGCCAGCGCT AGGGTCGCAA CTGGGTCTAC AGCCCCTGCA CTAGGAAGAA CTACCAATCC 7740 

CATAGCTCCA ACTAGAAAGA CGCTAGCAAT TTTCTTTCTC TTGTAGATTA AAAGCAAGCT 7800 

CCCAACAGTC AGCAAACCAA AAGCTGTCAA AACAGATGCT TCTGTCCCTG TTTGAGGCAA 7860

CTGATCTTTT TGATACACCA AACCATATAC AACTTCATTC CTGTCAGGCT TTCCTGTCTG 7920 

AATTAAATCT TTAGCTTCTT GTGAAATAAT CTCTTTATTT ACATAGTGAT AGGTGGCTGC 7980 

GTCCACTACA GAAGGAGCCA TCAAAAGGCT TCCAAGAAAT ACAGAGCCTA CAACTCCCTT 8040 

AATCTTACGA ATTGAAAAAC GGTCTTTTTT AAACACTTTT ATCTCCTTTA TTCATTCTCA 8100 

AAACTTCCTA ATAGCATCTT GCGGATAGTG CGCACGCGCA CCTCCCATTA ATTTTCGACG 8160 

ACTAGCCAGT GCCGTTACAT GGGCATGACC AATCTCTCTC AAAATAGGGC GAATCGGAAC 8220 

CTGAACATGC TTGACATGCA TGCCAATTGC AGTGTCTCCG ATATCCAATC CAGCATGAGC 8280 

CTTGATAAAT TCAACCTCAA CTGGATCCTG CATAAACTTA AAGGCTGCCA ACTGCCCCGA 8340 

ACCTCCTGCA TGAAGAGTAG GATGGACACT GACAATTTCC AGACCAAACT GCTCTGCCAC 8400 

CTGACGTTCA ACAACGAGAG CCCGATTGAC ATGCTCACAA CCTTGAACTG CTAAATGGAT 8460 

ACCTCTACTA CCTAGAATAT CCAAGATAGT CTCCACTATC AGCTCACCAA TCTCTTGACT 8520 

GGATTCTTTC CCAATATGAC CACCTAGCAC CTCACTAGAA GATAGACCTA AAACAAAAAG 8580 

GGCCCCCTGC TTCAAATTGG TCTTTTCTAA AACATCTTCC ACTACCTGAC GTGTTTCTCT 8640 

TTGAATCTGT GTCTCGTTCA TCTCTGTTAC CTCTGTTGTC ACTCTTCTAT CATACCGTTT 8700 

TTTCTTGTTT TTAGCAAGAT AGACAACCTA GAAAGTTTGC CCAATTACGC ATAAAACTCC 8760 

CAGAATTGAC TGGGAGTTAG CTAGTTTCTA TTCTATTTAT ATATATTTCA ACTTTCGTCC 8820 

CTTTTTGGGG TCTAGAATCA ATCTTCATAT GGTAATTGGC TCCAAAATGA AGTTTGAGCC 8880 






GTTGATCGAC ATTTTGAAGA CCAACTCCCC CACGTTTGAG TTGAe^TTGA CTACTATCAC 8940

CAGCATCTTG GAAGCCAACG CCATCATCCT CAATACGGAT GACCAATCCC GAATCCTGTT 9000

TCTGGACAGA AAGTTTAATA TGGCCCTGAC CTPCCTTTTC CTTAATGCCA TGGTAAAGAG 9060

CATTTTCTAC AAGGGGTTGT AGGACCAGCT TGGGTAAGAC TAAATTATCA AACGCAACAT 9120

TTTCATTAAT TTCGTATTCC AGCTTATCTC CATAGCGTTG TTTCTGGATA AAGAGATACT 9180

GGCGGACATG ATTGATTTCG TCAGAGAGAC AAATCAAGTC CTTGCCTTGA TTGAGCGCCA 9240

AGCGGAAATA GGTTGCCAAG GACTTGGTCA CCTGCACCAC TCGCTGACTA TCATGAAATT 9300

CAGCCATCCA GATGATGGTG TCCAAAGTGT TATAGAGGAA ATGTGGATTA ATCTGGCTCG 9360

AAAGGGCTTG AAGTTGGTAC TGACGGGTCG TTTCTTCCTG GCTACGAATA GCTACCATCA 9420

ACTGATCAAT CTGATCCAAC ATAGCATTAA ATTGGCGAGT TACTTCTCTC AGTTCATAGG 9480

CACCAACTTC CTTGGCACGA AGATTTTGAG CACCAGAAGC AATTTCCAAC ATGGTTTCTC 9540

TCAAATCCTT CAAAGGAGCA ATCCAGCGTT TAAGACTGAA CCACACTAAG CAGAGACAGA 9600

CAAGAAGAGA TGTGACACTG GCCCCAAGCA AGGTCCACAA GAGCTGACTC CGAACCTGGT 9660

CTAACTTTTC CAATGATGAC ACGCCAAGCA CCGTCCAATC AGTTCCTGCA ATCTTCTCTT 9720

GACTGACGTA GGATTTGTGA CCAGGAGTAT AACCCTGACC TGTATCGATG TAGGGTTTCA 9780

TAGCCTCCAT TTTGCTAGAC GAACTATAAA CTGTGTGTTG AGGATGGTAG ACAAATTCAT 9840

GGTTTTCATT GATAATGAAG GCAAAGCCCT GCTGCCCCAA CTGGAGTTCA TTGAGATAGG 9900

CTTCCAGAGT TTCATAAGAA ATATCCAAAC GAAGCACACC AAGATTGGCT CCCTTTGCAT 9960

CAACAAGTTC TTGAGTGACA GAAATGACCC ACTGACTATC TGATTTACGA GCTGGAGTCA 10020

AAACAGGCAT AGCTCCCTGA TGAATGGCCT TTTGGTACCA ATCCTCAGCC ATCATATCAG 10080

AGGAAGTTTT CATCTGCACA CTGTCATCTG TAGAAATGAC CTGACCAGAT TTGGTCACCA 10140

GCACAACAGT TTTCAAGTCC TTATCTGACT TCAAGATGGT CAAAAACAAA TCTCGGATTC 10200

CCTCGACCTT GTCTTGACTG GGATTCTCAG CATAGGCCAG AACATCCGTC TGCTGGGTCA 10260

AACCAGTCGA GGTGGTTTCT AGTTTTTTGA TATAAGACTG AATAAAGTGG CTAGTCTGGC 10320

TGATGGTCGT TTGGCTGTTG CCCTCAATGG TGGCCTCAAT GGCTGAAGAA CTTGATTGAT 10380

AGTAGAAAGT TCCAACCAGA GCTAGGAGAA TGAGAAAGAC CAGAAAGATG GAAATAACCA 10440

TTCTAACTAA AAGAGAAGAA CGCTTCATCG GTCTTCTCCC TTCTTAAACT GACGAGGTGT 10500

CACACCTGCA ATCTGCTTAA AACGTTGGGT AAAATAGTTC ATATCTTCAA AACCAACCTT 10560

CTCTGCGATC TCATAAATCT TCAGATCTGT AGTTAAAAGC AAGAGCTTGG CTTGTTTAAC 10620

ACGTTCTCTC ACCAGATAAT CCTGAAAAGG CAAGCCCAAC TCTTTCTTAA TCAAGGAACT 10680

CAGATAGGTC GGACTAAAAC CTAAGTCACT GGCTAAAGAC TTTAAACTAA ATTGGCTATC 10740

AGCCAGATGA GACTGGATTT TCTGGGCCAT GTTTCCTTCA AACCTATTAG TCAATAAATC 10800

TTGTAACTGC TCTTCTTTCT CTTCCTTGTC TAGTTTTTGT ttgattitcc CCAACATTTC 10860

CTCAATATCC TGACGAGAAA AGGGTTTGAG CAGGTAGTCG TCCACACCTA GTTTGACAGC 10920

AGACAAGGCA TAATCAAAAT CATCGTAACC TGTTAAAAAG ACCAAATGAA CCTGAGGATA 10980

GGTTTCTCGT ACCAGACTGG CCAACTGGAT GCCATTTAGA TGAGGCATGT TGATATCGGT 11040

TAAAATGATA TCTGGCACCT GCTTTTGGAT CAATTCCCAA GCCTGCCTTC CATTTTCAGC 11100

CTGACCGATG ATTTCCATAT CGTAGGCTGC TACATTGACC AGTTTAGTCA AACCTTGTCT 11160

TACCAGATAT TCATCTTCTA CGATTAAGAT TGTGTAGGTC ATGCTCTGCT CCTTTACCAC 11220

TTACTAGTAT CAGTATAGCA AAATTCTCCT CTAACTGCTT AGGAAAGACC TCTTATACTC 11280

AATAAAAATC AAAAAGTAAA CTAGGAAGAT AGCCACAGGT TTCTCAAAGT ACCGCTTTCA 11340

GGTTGTAAAT AAAACTGACG AAGTCGACTC AAAGTATAGC TTTGAGGTTG TAGATAAAAC 11400

TGACGAACTC GATAACCCTA CATACGGTAA GGCGACGCTG ACGTGGTTTG AAGAGATTTT 11460

CGAAGAGTAT TAATCAACAT AATCTAGTAA ATAAGCGTAC CTTTTTCTTC CATTTGGTCT 11520

TTGGGAATAA AGCGGATAGA GAGGCTATTG ATACAGTAAC GTAAGCCGCC CTTGTCCTGT 11580

GGACCATCCG TAAAGACATG CCCAAGGTGA GAATCTCCTA CTCGGCTCCG CACTTCCATA 11640

CGCGTCATAT TGTAGGACTT ATCTTCCTTG TAGGTGACAA CATCTGGACT GATGCGTTGG 11700

GTAAAACTAG GCCAGCCACA ACCAGACTCA AATTTGTCTT TTGATGAAAA GAGAGCTTCC 11760

CCAGTTGCTA TATCCACATA GATACCGGAT TCAAATTTAT CCCAGTAACG GTTTGAGAAA 11820

GCTCGTTCTG TTTGATTTTC CTGGGTAACT GCATACTCCT CACGTGACAG GGTCTTTTTC 11880

AATTCCTCAT CACTTGGTTT TGGATATTTG CTGGCATCAA TGACAGGATA GGCCGCCTGA 11940

TTAACATTGA TATGGCACTA GCCATTTGGA TTTTTCTTGA GATAGTCTTG ATGGTAATCC 12000

TCAGCCACCA CAAAATTCTT CAAGTTTTCC TTTTCAACTG CTAGAGGTTG ATCGTATTTC 12060

TTAGCCACCT CATCAAAGAC TTGGTTAATC ACTTCCAAAT CCTTGTCATC TGTGTAATAA 12120

ACACCAGTAC GGTACTGGGT CCCCACATCA TTTCCTTCTT TATTTTTGCT GGTTGGATTG 12180

ATAATGCGGA AATAGTGAAG CAGGATTTCC TTGAGAGAAA TTTCCTTGGC ATCATAGGTG 12240

ACATGGACGG TTTCTGCATG ACCTGTTTGG TTAATCAATT CGTACTTGGT TGTTTCTCCT 12300

CTACCATTTG CATAGCCTGA AACGGCATCC GTCACCCCGG GAACACGTGA GAAATATTCC 12360

TCCACTCCCC AGAAACAACC TCCAGCTAGA TAAATTTCGT GCAAGTCTGC GTCTTTACTA 12420

ATTTCTGTTT TTTTCACTGC TTTTCCTCCT TGGCTAACTG CCGCCTTTTC AATTTGCGAG 12480

GCATCTGTCT GCCCTGCATT TCGTATCAAT AGAACATAGA AACCGGTTAT GGCTAGAAAA 12540

AATACTCCTA GCAACAAGAA GATTTTTAAC TTATCATTCA TAAGACGCCT CCTAGGCTAA 12600

TTCCTTCAAA GTTTGCAAAA TTGCATCTTT TTCCATGAAT CCTGGATGTG TTTTGACCAG 12660

CTTGCCTTCT TTGTCTATAA AGGCTTGGGT TGGGTAAGAA CGGACACCAT AAGTTTCCAA 12720

AAGTTTGCCT GATGGGTCAA CTAGGACTGG GAGATTTTTA TAATCCAATC CCTTATACCA 12780

ATTCTTAAAG TCCGCTTCAG ATTGCTCTCC CTTATGTCCT GGTGACACTA CTGTCAAGAC 12840

CACATAGTCA TCACCAGCTT CTTTAGCAAT CTCATCCGTA TCTGGAAGAC TAGCCAGACA 12900

GATGGAACAC CAAGAAGCCC AGAATTTGAG ATAGACTTTC TTGCCCTTGT AATCAGATAA 12960




TTGCCATCTA 


CTCGCATCAA 


TTCAAAATCA 


GCCACCTCTT 


TCCCTTTAGC 


13020 


T(Vf! fTTfJTT 

A VJlv VJ X k \J A- K 


TTACTAGCTG 


TCTGCTCCGT 


CTTCATTTCA 


TCTTTCGTTT 


GGTGTTCACT 


13080 




TTGCCTGAAC 


AAGCCGTCAA 


ACAAAGGAGC 


GAACCTGCTC 


CAAGAACACA 


13140 


lull iv\_v_nl 


TTTTTCATAT 

1 A 1 A li A -** A 


TGATATTCCT 

A Url # 4 » A A ^» A 


TTCCATTTTA 


TTCAAATAAT 


TGACTTAAAA 


13200 


X 1 \irvi\iLn 1 1 


TCCAAACAGA 


ACCAAGAAGC 


CCATCACAAT 


AATGAGAAAA 


CCACCCACTT 


13260 


1 1 U unviurt 1 




GGATGAAGTT 


TTCGGAAATG 


TTTCAAAACA 


TAACTAGAGG 


13320 


Tr'AflAnfTAft 


AAGCAAGAAT 


GGTAGCGCCA 


AGCCCAGCGT 


ATACACCAAC 


ATGAGACCAG 


13380 




AGCTCCTGAA 


CCACCTGAAG 


CCGCCAAGGC 


CAAAACAGAC 


CCCAGAACCG 


13440 


rrrrrt car* a 


ju^rCTr caa 


GCAAAACTAA 


AGGTCAAGCC 


CAATAAAAAT 


GCCTGACTAT 


13500 


AGCCCTTACC ATTTTGCCCC TGTCCTTGCA GTTGTAGCCT CTTTTCCTTA TAAAGCCCCT 13560

TAAAGTGTAG AATCTCCATT TGGTGCAAAC CAAGAAGGAT AATAATTGCC CCAGTAAGAT 13620

ATTGGAACCA AGAAGCATAA AGCAAATCGC CTAAAAAACC AGCTCCATAG CCCAACAAAA 13680

TAAATATAAA GGAAATTCCT GCTATAAAGG CCAGAGTTCG TAATAAACTA GTAACTGAGA 13740

TTGAAAATTT GCCGCTAGAA GCCTGAGCAC CATCCTTATC ATCTAGTAAC ACTCCTGTAT 13800

AGACCGGTAA CAAAGGTAAG ATACAAGGAG AAAAGAAGGA TAGAATCCCT GCCAAAAAGA 13860

CACTTAGAAA AAAGAAAATA TGACCCATAA AGTTCCTCCT ATCATTTTAT TGATAGATTT 13920

ATTATA 13926



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20199 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



CCCAGCAGAA AAATGGCATT TGGAGATAAT GGAAATCGTA AAAAAACTAT GTTTGAGAAA 60

ATAACCTTGT TTATCGTGAT TATCATGCTA GTAGCAAGTT TATTGGGAAT TTTTGCAACT 120

GCAATTGGTG CCCTCAGTAA TCTATAAAAT AGATTCAAGA AAATTTAGTG ACTGGGATTT 180

CCCAGCCCTT TTTTAAAGTG AGAAGAAATA ATGAGTATGT TTTTAGATAC AGCTAAGATT 240

AAGGTCAAGG CTGGTAATGG TGGCGATGGT ATGGTTGCCT TTCGTCGTGA AAAATATGTC 300

CCTAATGGAG GCCCTTGGGG TGGTGATGGT GGTCGTCGAG GCAATGTGGT CTTCGTTGTA 360

GACGAAGGAC TACGTACCTT GATGGATTTC CGCTACAATC GTCATTTCAA GGCTGATTCT 420

GGTGAAAAAG GGATGACCAA AGGGATGCAT GGTCGTGGTG CTGAGGACCT TAGAGTTCGA 480

GTACCACAAG GTACGACTGT TCGTGATGCG GAGACTGGCA AGGTTTTAAC AGATTTGATT 540

GAACATGGGC AAGAATTTAT CGTTGCCCAC GGTGGTCGTG GTCGACGTGG AAATATTCGT 600

TTCGCGACAC CAAAAAATCC TGCACCGGAA ATCTCTGAAA ATGGAGAACC AGGTCAGGAA 660

CGTGAGTTAC AATTGGAACT AAAAATCTTG GCAGATGTCG CTTTAGTAGG ATTCCCATCT 720

GTAGGGAAGT CAACACTTTT AAGTGTTATT ACCTCAGCTA AGCCTAAAAT TGGTGCCTAC 780

CACTTTACCA CTATTGTACC AAATTTAGGT ATCGTTCGCA CCCAATCAGG TGAATCCTTT 840

GCAGTAGCCG ACTTGCCAGG TTTGATTGAA GGGGCTAGTC AAGGTCTTGG TTTGGGAACT 900

CAGTTCCTCC GTCACATCGA GCGTACACGT GTTATCCTTC ACATCATTGA TATGTCAGCT 960

AGCGAGGGCC GTGATCCATA TGAGGACTAC CTAGCTATCA ATAAACAGCT GGAGTCTTAC 1020

AATCTTCGCC TCATGGAGCC TCCACAGATT ATTGTAGCTA ATAAGATGGA CATGCCTGAG 1080

AGTCAGGAAA ATCTTGAAGA CTTTAAGAAA AAATTGGCTG AAAATTATGA TGAATTTGAA 1140

GAGTTACCAG CTATCTTCCC AATTTCTGGA TTGACCAAGC AAGGTCTGGC AACACTTTTA 1200

GATGCTACAG CTGAATTGTT AGACAAGACA CCAGAATTTT TGCTCTACGA CGAGTCCGAT 1260

ATGGAAGAAG AAGCTTACTA TGGATTTGAC GAAGAAGAAA AAGCCTTTGA AATTAGTCGT 1320

GATGACGATG CGACATGGGT ACTTTCTGGT GAAAAACTCA TGAAACTCTT TAATATGACC 1380

AACTTTGATC GTGATGAATC TGTCATGAAA TTTGCCCGTC AGCTTCGTGG TATGGGGGTT 1440

GATGAAGCCC TTCGTGCGCG TCGAGCTAAA GATGGGGATT TGGTCCGCAT TGGTAAATTT 1500

GAGTTTGAAT TTGTAGACTA GGAGACTGGT ATGGGAGATA AACCGATATC TTTCCGAGAT 1560

GCGGATGGTA ArmVTTTC CGCCGCAGAC GTTTGGAATG AAAAGAAATT GGAAGAACTA 1620

TTTAATCGTC




TCGTGCCTTG 


AGATTGGCAC 


GAACTAAAAA 


GGAAAATCCA 


1680


TCTCAGTAAA GAAGCTAAAA AATCCCGTGC CTCATCAGAC ACGGGATTTT GTGGTACGAC 1740

AGGCATGTAT AGCAAACTGA ATCTGGAATA GCACAGCATA TCTTCTAAAA TATAGTAAAA 1800


TG & & ft TG AG A 


A P RfiG AC AAA 


TCGATCAGGA 


CAGTAAAATC 


GATTTCTAAC 


AATGTTTTAT 


1860 


AAGCAGAGAT 




AGTTTCAATC 

«Ax^ AAA ^»**A * * 


AACTATATTG 


TTATAAATTG 


ATTTGAATTT 


1920 


U.AAAAJ. 1 r\s\f*i 


TTG TfTG ATT 


CTTATTTCAA 

1m A. 4 *» AAA * *• » 


TTTGTTATAG 


TATATCTGAT 


GTCAAAGTTC 


1980 


TCGGCGAGTC 


AAATAGCGAT 


TCCCAAGCCT 


GACTATCGTG 


AGGTAGCGGA 


TTAAAATGGT 


2040 




A Cf*G TTTT AA 


GTCTGACGCT 


GGAAATAAGA 


ATTGTCAGAA 


GAAGGGATAG 


2100 




crrcTACGAA 


CAGGAACGTG 


ATAATAAGGC 


GTATATAGCG 


GATAAGAGGG 


2160 


I.AIA.AAAI. 1 


1 Aniiu 1 


AAAGGTAGTC 


GTAACCTATA 


TGCGTAAATC 


ACGAGAGTAA 


2220 




HL i Anu/\ ill 


TCTATTTTCA 


CTGTAACCTT 


TTAACGCCCT 


TATATCTTGT 


2280 


ATACACGAW* 




nACTTATCCC 


GTGAGGTCTA 


TCACTATAAA 


GAGAAAACGA 


2340 


L AtiA I AL>AAvj 


TP A TPPTG AG 


TCACfinTTAT 


CTGTCTGATA 


GGACGGTATG 


TATAAAACGC 


2400 


11L lu I uAAU 


TVTAftAGAAGG 


GGGAGAAGTT 


CTTGCTAAAA 


TTTAGTTGAA 


CAGCCGTATT 


2460 


CCGATAt; A 1 A 


OA 1 AAv» A*jA 1 


CTArrTCTTAG 


CTCCTACTCA 


GTTTTAGGGG 


ATAAAAAAGG 


2520 


GGCAATAW-ti 


All ^VlftUftrtA 


GATTATACTC 


TTCGAAAATC 


TCTTCAAATC 


ACGTCAATAT 


2580 


r*r* ^* WPT rp/-»<-» 

(.AA-V- 1 i Vj 1 


T ft TV **IV^T A ("2.(^1 


ATACTGACTA 


CGTCAGTTCC 


ATCTACAACC 


TCAAAACAGT 


2640 


^*Tt^^^t^^^^ ^ ^^^^ A 

GTTTT\»Atit.A 


AV- C 1 VA.VjOV» 1 




T^TGATCTTT 

A a A UT* A A * A 


GATTTTCATT 


GAGTATTACT 


2700 


AA 1 1 W Au 1 i A 


P*P A IfTPITPr 




TATCCAATAA 


AATTCAAAAG 


GATGGAAAAA 


2760 




T ATG AT AT AC 


TTTATTTTGA 


AGACCTTATT 


AGAAATCTTG 


AAAGAGTATT 


2820 


VjAAAAvI IAV» 


A A1Y2 AGAAAA 


ATTGTTATCA 


ATGGTGGATT 


ACCACTGCAA 


GGTGAAATCA 


2880 


L. 1 A 1 1 Au i uu 


TWT A AA AAT 


AGTGTCGTTG 


CCTTAATTCC 


AGCTATTATC 


TTGGCTGATG 


2940 


ATGTGGTGAC TTTGGATTGC GTTCCAGATA TTTCGGATGT AGCCAGTCTT GTCGAAATCA 3000

TGGAATTGAT GGGAGCTACT GTTAAGCGTT ATGACCATGT ATTGCAGATT GACCCAAGAG 3060

GTGTTCAAAA TATTCCAATG CCTTATGGTA AAATTAACAG TCTTCGTGCA TCTTACTATT 3120

TTTATGGGAG CCTCTTAGGC CGTTTTGGTG AAGCGACAGT TGGTCTACCG GGAGGATGTG 3180

ATCTTGGTCC TCGTCCGATT GACTTACACC TTAAGGCGTT TGAAGCTATG GGTGCCACTG 3240

CTAGCTACGA GGGAGATAAC ATGAAGTTAT CTGCTAAAGA TACAGGACTT CATGGTGCAA 3300

GTATTTACAT GGATACGGTT AGTGTGGGAG CAACGATTAA TACGATGATT GCTGCGGTTA 3360

AAGCAAATGG TCGTACTATT ATTGAAAATG CAGCCCGTGA ACCTGAGATT ATTGATGTAG 3420

CTACTCTCTT GAATAATATG GGTGCCCATA TCCGTGGGGC AGGAACTAAT ATCATCATTA 3480

TTCATCGTGT TGAAAGATTA CATGGGACAC GTCATCAGGT GATTCCAGAC CGCATTGAAG 3540 

CTGGAACATA TATATCTTTA GCTGCTGCAG TTGGTAAAGG AATTCGTATA AATAATGTTC 3600 

TTTACGAACA CCTGGAAGGG TTTATTGCTA AGTTGGAAGA AATGGGAGTG AGAATGACTG 3660 

TATCTGAAGA CAGCATTTTT GTCGAGGAAC AGTCTAATTT GAAAGCAATC AATATTAAGA 3720 

CAGCTCCTTA CCCAGGCTTT GCAACTGATT TGCAACAACC GCTTACCCCT CTTTTACTAA 3780 

GAGCGAATGG TCGTGGTACA ATTGTCGATA CGATTTACGA AAAACGTGTA AATCATGTTT 3840

TTGAACTAGC AAAGATGGAT GCGGATATTT CGACAACAAA TGGTCATATT TTGTACACGG 3900 

GTGGACGTGA TTTACGTGGG GCCAGTGTTA AAGCGACCGA CTTAAGAGCT GGGGCTGCAC 3960 

TAGTCATTGC TGGGCTTATG GCTGAAGGTA AAACTGAAAT TACCAATATC GAGTTTATCT 4020 

TACGTGGTTA TTCTGATATT ATCGAAAAAT TACGTAATTT AGGAGCGGAT ATTAGACTTG 4080 

TTGAGGATTA AACCGTAGAG GTGTTTATGA ATATTTGGAC CAAATTAGCA ATGTTTTCTT 4140 

TTTTTGAAAC GGATCGCTTG TATTTGCGTC CTTTCTTTTT TAGTGATAGT CAGGACTTCC 4200

GCGAGATAGC TTCAAATCCA GAAAATCTTC AATTTATTTT CCCAACGCAG GCAACTCTGG 4260 

AAGAAAGTCA ATATGCACTG GCCAATTACT TTATGAAGTC CCCTTTGGGA GTGTGGGCAA 4320

TTTGTGACCA GAAAAATCAA CAAATGATTG GTTCTATTAA ATTTGAGAAG TTAGATGAAA 4380

TCAAAAAAGA AGCTGAGCTT GGCTATTTTT TGACAAAAGA TCCTTGCTCC CAAGCATTTA 4440 

TGACAGAGGT TGTTAGAAAA ATTTGTCACC TTTCTTTTGA GGAATTTGGC TTAAAACAAT 4500 

TATTTATCAT TACCCACCTT GAAAATAAAG CTAGCCAAAG AGTTGCTCTT AAGTCTGGAT 4560 

TTAGTTTGTT CCGTCAGTTT AAGGGAAGTG ATCGTTACAC AAGAAAAATG CGGGATTATC 4620 

TTGAATTTCG GTATCTAAAA GGAGAGTTCA ATGAGTAAGC ATCAGGAAAT TCTAAGCTAT 4680

TTGGAGGAAT TACCAGTAGG TAAAAGGGTC AGTGTTCGTA GCATTTCGAA TCATCTAGGA 4740 

GTTAGTGATG GAACAGCCTA TCGGGCTATT AAAGAAGCTG AAAACCGTGG AATTGTGGAG 4800 

ACCCGTCCTA GAAGTGGAAC AATTCGTGTT AAATCCCAGA AAGTTGCTAT AGAGAGATTA 4860

ACGTTTCCTG AAATTGCAGA AGTGACTTCT TCTGAGGTTC TGGCTGGGCA AGAAGGTTTA 4920 

GAGAGAGAAT TTAGTAAGTT TTCAATTGGT GCCATGACTG AACAAAATAT CTTGTCTTAC 4980

CTTCATGATG GGGGGCTCTT GATTGTCGGA GACCGAACCC GTATTCAGTT GCTAGCCTTG 5040 

GAAAATGAAA ATGCAGTTCT GGTTACAGGG GGATTTCAGG TTCATGATGA TGTGCTTAAA 5100 

CTGGCCAATC AAAAAGGGAT rCCTGTTCTA AGAAGTAAGC ATGATACCTT TACCGTCGCG 5160 




ACCATGATCA ATAAAGCCTT GTCAAATGTC CAAATCAAGA CTGATATTCT GACAGTTGAG 5220 

AAACTTTATC GCCCTAGTCA TGAGTATGGT TTTCTGAGAG AGACAGATAC AGTTAAAGAT 5280 

TATTTGGACT TGGTTCGTAA GAATCGTAGC AGCCGTTTCC CTGTTATCAA TCAACATCAG 5340 

GTCGTTGTTG GTGTTGTAAC CATGAGAGAC GCTGGTGATA AATCACCAAG CACGACAATT 5400 

GATAAGGTTA TGTCTCGTAG TCTATTTTTG GTTGGATTAT CGACAAATAT TGCCAATGTG 5460 

AGTCAACGGA TGATCGCAGA AGACTTTGAA ATGGTACCAG TTGTTCGAAG CAATCAAACT 5520 

TTGCTTGGCG TTGTGACGCG ACGAGATGTC ATGGAGAAGA TGACCCCTTC CCAAGTTTCG 5580 

GCTCTACCAA CTTTTTCTGA GCAGATTGGA CAAAAGCTCT CTTATCACCA TGATGAAGTA 5640 

GTCATTACAG TGGAACCCTT TATGCTAGAA AAAAATGGAG TTTTGGCTAA TGGTGTATTG 5700

GCAGAAATTC TGACCCACAT GACCCGATTT AGTTGTTAAT AGTCGTCGCA ATCTCATTAT 5760 

CGAGCAGATG CTGATCTACT TTTTGCAGGC TGTTCAGATA GATGATATAT TGCGCATTCA 5820 

GGCACGGATT ATTCATCATA CGAGACGGTC AGCTATAATT GATTACGATA TTTATCATGG 5880 

TCACCAGATT CTTTCAAAAG CAAATGTGAC TGTTAAAATT AATTAGAAAC TAGGAGAAAA 5940 

GATGATAACA TTAAAATCAG CTCGTGAAAT CGAAGCTATG CACAAGGCTG GTGATTTTCT 6000 

AGCAAGTATT CATATAGGCT TACGTCATTT GATTAAGCCA GGCGTAGATA TGTGGGAAGT 6060 

TGAAGAATAT GTCCGCCGTC GTTGTAAAGA AGAAAATTTC CTTCCACTTC AGATTGGGGT 6120 

TGACGGTGCC ATGATGGACT ATCCTTATGC TACCTGTTGC TCTCTTAACG ATGAAGTGGC 6180 

TCACGCTTTC CC7CGTCAT? ATATCTTGAA AGATGGTGAT TTGCTCAAAG TTCATATGGT 6240 

TTTGGGAGGT CCCATTCCTA AATCTGACCT AAATGTCTCA AAATTAAACT TCAACAATGT 6300

TGAACAAATG AAAAAATACA CTCAGAGCTA TTCTGGTGCT TTAGCAGACT CATGTTGGGC 6360

TTATGCTGTT GGTACACCGT CCGAAGAAGT CAAAAACTTG ATGGATGTAA CCAAAGAAGC 6420 

TATGTACAAG GGTATTGAGC AAGCTGTTGT TGGAAATCGT ATCGGTGATA TCGGTGCGGC 6480 

TATTCAAGAA TACGCTCAAA GTCGTCGTTA CGGTGTAGTG CGTGATTTGG TTGGTCATGG 6540 

TGTTGGCCCA ACTATGCACG AAGAACCAAT GGTTCCTAAC TATGGTATTG CAGGTCGTGG 6600 

ACTCCGTCTT CGTGAAGGAA TGCTCTTAAC CATTGAACCA ATGATCAATA CAGGCGATTG 6660 

GGAAATTGAT ACAGATATGA AAACTGGTTG GGCGCATAAG ACCATTGACG GTGGATTGTC 6720 

ATGTCAGTAT GAACACCAAT TTGTCATTAC GAAAGATGGA CCTGTTATCT TGACTAGCCA 6780 

AGGTGAAGAA GGAACTTATT AATAAAAAGT GAAAAGACTA CTGGAACTTT ATTTTGATAA 6840 

AAAATCCACT AGATCTTTTC ATAATAAAAC GCATTGTATC AAGTGTTAGG GGCTGATATC 6900 

ATGCGTTTTT CTGCTTTTAA GATTTTTTCC AACTCTGTTT GTAAGCGCAt CATAACAAAG 6960

GGTCTAGGAT TCAGGGCTCT CCTCCTATAT ACTATTAGTA AAGTAAAACT AAGGGAGGAT 7020

ATTTTAGTGT CGCAGTCTAT TGTTCCTGTA GAGATTCCAC AATATTGTCG TTTTGATTCT 7080 

AAAAAGAGAA ATGGAATTCT GTTTAATGTT CGTATTGCCA ATCTTAAATT TACTTTTTTA 7140 

TATTATACTT CCTGCGAAAC AAAATATGGT ATAGTAGTTC TATGAATGAT GAAGCAAGTA 7200 

AACAACTAAC TGATGCACGA TTTAAGCGTC TTGTTGGTGT TCAGCGTACC ACTTTTGAAG 7260 

AGATGTTAGC TGTATTAAAA ACAGCTTATC AACTTAAACA CGCAAAAGGT GGACGAAAAC 7320 

CTAAATTAAG CCTAGAAGAC CTTCTTATGC CCACTCTTCA ATAGTGCGAG AATATCGAAC 7380 

TTATGAAGAA ATTGCGGCTG ATTTTGGTAT TCACGAAAGC AACTTTATCC GTCGGAGCCA 7440 

ATGGGTTGAA ATAACTCTTG TTCAAAGTGG TTTTACGGTT TCAAGAACTC CTCTCAGTTC 7500 

TGAGGACACG GTAATGATTG ATGCGACGGA AGTAAAAATC AATCGCCCTA AAAAAACAAT 7560 

TAGCGAATGA TTCTGGTAAA AAGAAATTTC ACGCTATGAA GGCTCAAGCG ATTGTCACAA 7620 

GTCAAGGGAG AATTGTTTCT TTGGATATCG CTGTGAACTA TAGTCATGAT ATGAAGTTGT 7680 

TCAAAATGAG TCGTAGAAAT ATCGAACAAG CTGGTAAAAT CTTGGCTGAC AGTGGTTATC 7740 

AAGGGCTCAT GAAGATATAT CCTCAAGCAC AAACTCCACG TAAATCCAGC AAACTCAAGC 7800 

CGCTAACAGC TGAAGATAAA GCCTATAACC ATGCGCTATC TAAGGAAAGA AGCAAGGTTG 7860 

AGAACATCTT TGCCAAAGTA AAAACGTTTA AAATATTTTC AACAACCTAT CGAAATCATC 7920

GTAAACGCTT CGGATTACGA ATGAATTTGA GTGCTGGTAT TATCAATCAT GAACTAGGAT 7980 

TCTAGTTTTG CAGGAAGTCT ATTGAGGTAT TGAGCTAGTT TATGAAAAAA TTGGGTGAAA 8040

AGTCGAGTGT TTTAGAAACC CACAGTGTAG TATTCTAGTT TCAATCCACT ATATTTTGCT 8100 

ACTCCCCGTA AAGTTTCTAT TTTCCCTGAT TTCTGATATA ATAGAAATAT TGACTTCAAG 8160 

AGTAAGGAAG AGAAGATGAA CGCATTATTA AATGGAATGA ATGACCGTCA GGCTGAGGCG 8220

GTGCAAACGA CAGAAGGTCC CTTGCTAATC ATGGCAGGGG CTGGTTCTGG AAAGACTCGT 8280 

GTTTTGACCC ACCGTATCGC TTATTTGATT GATGAAAAGC TGGTCAATCC TTGGAATATC 8340 

TTGGCCATTA CCTTTACCAA CAAGGCTGCG CGTGAGATGA AAGAGCGTGC TTATAGCCTC 8400 

AATCCAGCGA CTCAGGACTG TCTGATTGCG ACCTTCCACT CCATGTGTGT GCGTATTTTG 8460 

CGTCGCGATG CGGACCATAT TGGCTACAAT CGTAATTTTA CAATTGTGGA TCCTGGTGAA 8520 

CAGCGAACGC TCATGAAACG TATTCTCAAA CAGTTGAACT TGGACCCTAA AAAATGGAAT 8580 

GAACGAACTA TTTTGGGGAC CATTTCCAAT GCTAAGAATG ATTTGATTGA TGATGTTGCT 8640 

TATGCTGCCC AAGCTGGCGA TATGTATACG CAAATTGTGG CCCAGTGTTA TACAGCCTAT 8700

CAAAAAGAAC TTCGTCAGTC TGAATCCGTT GACTTTGATG ATTTGATTAT GCTGACCTTG 8760

CGTCTCTTTG ATCAAAATCC TGATGTTTTG ACCTACTACC AGCAAAAATT CCAATACATC 8820 

CACGTTGATG AGTACCAAGA TACCAACCAC GCTCAGTACC AATTGGTCAA ACTCTTGGCT 8880 

TCCCGTTTTA AAAATATCTG TGTGGTTGGG GATGCGGACC AGTCTATCTA CGGTTGGCGT 8940 

GGTGCTGATA TGCAGAATAT CTTGGACTTT GAAAAGGATT ACCCCAAAGC CAAGGTTGTT 9000 

TTGTTGGAGG AAAATTACCG CTCAACCAAA ACCATTCTCC AAGCGGCCAA CGAGGTTATT 9060 

AAAAATAATA AAAATCGCCG TCCTAAAAAT CTCTGGACTC AAAACGCTGA TGGGGAGCAA 9120 

ATCGTTTACT ATCGTGCCGA TGATGAGCTG GATGAGGCTG TATTTGTAGC CAGAACCATC 9180 

CATGAACTTA GTCCCAGTCA AAACTTCCTT CATAAGGATT TTGCAGTTCT CTATCGGACT 9240 

AATGCCCAGT CCCGTACAAT TGAGGAAGCC CTGCTCAAGT CTAACATTCC TTATACCATG 9300 

GTTGGCGGAA CCAAATTCTA CAGCCGTAAG GAAATTCGCG ATATTATTGC TTATCTCAAC 9360 

CTTATTGCTA ATTTGAGTGA CAATATTAGT TTTGAGCGTA TTATCAACGA GCCTAAACGT 9420

GGAATTCGTC TAGGTACAGT TGAGAAAATC CGTGATTTTG CAAATTTGCA AAATATGTCT 9480

ATCCTGGATG CTTCTGCTAA TATTATGTTG TCTGGTATCA AGGGTAAGGC AGCCCAATCT 9540 

ATCTGGGATT TTGCCAATAT GATGCTTGAT TTGCGGGAGC AGCTAGACCA CTTAAGCATT 9600 

ACAGAGTTGG TTGAGTCCGT CCTAGAAAAA ACAGGTTATG TCGATATTCT TAACTCCCAA 9660 

GCGACTCTAG AAAGCAAGGC ACGGGTTGAA AATATCGAAG AGTTTCTTTC TGTTACGAAG 9720

AACTTTGATG ACACCACGGA TGTGACAGAA GAGGAAACTG GTCTCGACAA ACTGAGTCGT 9780 

TTCTTAAATG ACTTGGCTTT GATTGCCGAC ACAGATTCAG GTAGTCAGGA GACATCAGAA 9840 

CTGACCTTGA TGACCCTGCA TGCTGCCAAA GGTCTCGAAT TTCCAGTTGT CTTTTTGATT 9900 

GGGATGGAAG AAAATGTCTT TCCACTTAGT CGTGCGACTG AAGATTCAGA TGAATTAGAA 9960 

GAAGAGCGCC GTCTAGCCTA TGTAGGTATC ACGCGTGCAG AGAAAATTCT CTATCTGACC 10020 

AATGCCAACT CACGCTTGCT TTTTGGTCGT ACCAATTATA ACCGTCCGAC TCGTTTTATT 10080 

AACGAAATCA GTTCACACTT GCTTGAGTAT CAAGGTCTGG CTCGTCCTGC AAATACAAGC 10140 

TTTAAGGCAT CATATAGCAG TGGTAGTATT TCCTTTGGTC AAGGTATGAG TTTGGCTCAG 10200

GCTCTTCAAG ACCGTAAACG CGGTGCTGCC CCAAAATCAA TCCAGTCAAG CGGTCTTCCA 10260 

TTTGGTCAAT TTACAGCTGG CGCAAAACCA GCATCTAGCG AGGCAAATTG GTCCATTGGT 10320 

GATATTGCTC TCCACAACAA ATGGGGAGAG GGAACCGTTC TGGAAGTTTC AGGTAGCGGT 10380 

GCTAGGCAGG AATTGAAAAT CAATTTCCCA GAAGTAGGTT TGAAAAAACT TTTAGCCAGT 10440 

GTGGCTCCAA TTGAGAAAAA AATCTAATTT TCCATCCTTC TCACGAATAA TAAAGTGAGG 10500 
AGGATTTTTA TGTACAGTAT TTCATTCCAA GAAGATTCAC TATTACCAAG AGAAAGGCTG
GCCAAGGAAG GAGTTGAAGC GCTTAGTAAC CAAGAGTTGC TAGCTATTTT ACTCAGGACA 
GGAACACGTC AAGCTAGCGT TTTTGAAATT GCCCAAAAAG TCTTGAACAA TCTTTCAAGC 
CTAACGGATT TGAAAAAAAT GACCCTGCAG GAATTGCAGA GTTTGTCTGG TATTGGGCGT 
GTTAAGGCCA TAGAATTACA AGCTATGATT GAACTGGGGC ATCGTATTCA CAAACACGAG 
ACTCTTGAAA TGGAAAGTAT TCTCAGCAGT CAAAAGTTGG CCAAGAAGAT GCAGCAGGAA 
TTAGGGGATA AAAAACAAGA GCACCTGGTG GCACTCTATC TCAATACTCA AAATCAAATC 
ATCCATCAGC AGACCATTTT TATCGGGTCT GTAACTCGTA CTATCGCTGA ACCGCGAGAG 
ATTCTTCACT ATGCAATCAA GCATATGGCG ACTTCTCTTA TCTTGGTCCA CAATCATCCT 

CCAAAATGAT GATCATGTCA CTAAACTTCT TAAAGAAGCC 

TCTCTCATTC TAATTACTTT 
CGACATAGTC AAAGAGTTTT 



TCAGGAGCGG TAGCGCCTAG 
TGCGAATTGA TGGGGATTGT 
AGTTATCCTG AAAAGACAGA 
TTATCTTTGG GACGATTTTC 
ACATCATCCG TACTCATGAC 



TCTCTTGGAC CATTTCATTG 
TTTAATCTAA AGTTCATTAA 



AAAAACAAGT TCTGGATGCC ATTGGACACC GAGAAAGGCG 
AGCCTCAATG ATACCATCTT TAGGATCATG AGCCACAACT 



TTTAAATTTG 
TCTCCATAGA 
TCAACAGAAG 
ACGTTAAAGA 
TCCTTGATGA 
GTTTTGGGTT 
TCAATCAAAC 
ATCCCTCCAG 
TCATCATCTG 
TGATTTTCGC 
TCAAATCGAC 
TGAGAATTCC 
TGGGAACAGT 
AAATCCCGTA 
GGTCAAAGGC 



GTGCTAAGTC 
TTTCTTGGAG 
AATCCTGCCA 
GTTGCGTACC 
GGGCCAGTTC 
CGCCATAAAA 
TGATATAGTG 
CATCTTTAAC 
GATGAGTTTT 
TTTCTAATCC 
ATACTGAACG 
TTTCACACCA 
TAGGATAGCA 
AATGGGAATC 
CATGATAATC 



CTTGATGCTC TGGTGGTGGA AGGAGTTGAT ATGAGAGATT 
AACGGTATCT GGTTCTGTTA CCAAGCGTTG AGTTGTGTAC 
ATGGTCTTCG ATATCTTGGT ACAAAGTTCC ACCCATGGCA 
ACGGCAGACA GAGAAAATGG GCTTTTTCTG TTTAATAGCT 
GAACATATCT CTTTGAAGGT GATACTCATC ACTATCAATG 
TTTTGGATCG ACATTTTGCC CACCTCTCAA GATGAGCTTG 
GCAGGCCATT TCTTGATCAC CAATCGGTAG GATGATGGGA 
GCCTTCAACA AAGCCTTTTG CTGCGTAGCT CATCATGATG 
TTCGTTTCCT GTAATCCCAA TAACTGGTTT TTTCATAAAA 
TCTTTTCGCA TGAAGTAGAG GAGGGTTTGG AGTTCACTTG 
ACCACGTCTT TTGGTAAATG CAGATGGACT GGTGAAAAAC 
GCATCAACCA AGAGATTAGC AACCTCTTGT GACTTGACGC 
GTCTTCACAT CAGCATCCTT GATTTTATCC TTGATCTGAG 
CCGTCAGGAG TTTGGGTACC GACTTCAGGA TGGTCGTCTA 
TTCATCTTGT TACGTTCGTG GAAGCGGTAG TGGAGAAGGG 



10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120 

12180 

12240 
CATGGCCCAT ATTTCCAATA CCAACCAGCA TGACATTGGT AATACAGTTG TCATTGAGCA 12300

AATCGGCAAA AAATGTCATT AGTTTTTTGA CATCATAGCC AAAACCACGA CGACCAAGTT 12360 

CACCAAAATA GGAAAAATCA CGACGTACGG TCGCTGAATC AATACCGATA GCCTCTGCAA 12420 

TTTGCTTAGA GTTGGCACGT TCAATCTTTT CTGCATGAAA TCTCTTAAAA ATTCGATAGT 12480 

AGAGAGAGAG TCTTTTTGCT GTAGCTTTTG GAATAGCAAA CTGTTTATCT TTCACAAAAT 12540 

CACAACCTTT CTATTCTTCT ATTTTATAGA AACATTGTGA AAAAATCAAC AAAAATAAGA 12600 

AAAAACTAAG AAAAATCTTA GTTTTGATGT AAAAAATCTG CATGAGATAG AAAACGGTAG 12660 

AGGTCTCCGA CCAGCCCCTG ATAAACTTTT TTGCCCCTAA AAGTCAGAGA AGTCACATAA 12720 

AGTGTATCTG GTAAGGTTAC ACATCCTGAC AAAGTCAACA TGAGAGCCTC ATGATCCTCA 12780 

TACTTGAGAG TACGCTCTAC ATGATAGCAG TCCTTATAGG TCAGTTCAAA CATTTTGGCT 12840 

CTATCTTTCC GATTTTGTAA AGACACCACG TTCTACCAAG CTATCCATGA GGAAGTAGAA 12900 

TTTTTCCTGA TGAATATGGT GGTCTTCTGA TTTGAAAATA TCAACTAGAC GAAGGCCAAA 12960 

CTTGTCAGTG ATATTGATTT TAGCCCCTGT AAGTTCCTTG TTAATGATGA TTTTGAGTTG 13020 

GAAGCCTTCA CCGCTGTTTG GCACTTTTTC CAAAAGGCGA GTCAGTTCAT AGTTACCAAC 13080 

CTTAGTTTCA AAAAAGGTGT TATCTTTGAG GGTGAATTTT TTAACAGAAG GGCTAAGAGT 13140 

GTAATCGTAA CGACAATTTT TTAACTGAAT GATTTTTTCA AATGCCATAT GGCTAACCTC 13200 

CGATAATTTC TTTTAAGGTT TTTGCGAGGG TTTGTAGGTC TTCAACGGTA TTTTGTGGCG 13260 

ACAAACTGAT GCGAAGGGAT TCCTTCAAGC GTTCTGAATT TGCGCCATAC ATGGCTTCAA 13320 

GAACATGGCT GGATTGGACA ACGCCTGCAG TACAGGCTGA GCCAGTAGAG ATTGAAATTC 13380

CAGCTAAATC TAGCCGAAGG AGTAAGAGGT CATTTTTCTG ACCAGGAAAT CCAATATTGA 13440 

GAACATAAGG GAGATGATGT TTTCCTCTAT TCAGGTAATA CTGAATGCCC TCCAGCTCTG 13500 

CCAGAAAGGC AGTTTCTAGA TTTTGTACAT GTTGAAAATG TTCTTCTTGT TTTTCTAGGT 13560 

CTTCTTTTAG GGCTGCAACC ATGCCTACAA TGGCAGGCAG ATTTTCAGTT CCTGCACGTT 13620 

TTTTCTGTTC CTGGTCTCCG CCATGTAGAT AGGAATCAAA GTCCATGCTA GATGCGTAGA 13680 

GAAAACCGAT TCCCTTAGGA CCATGGAATT TGTGGGCAGA AGCAGTGAGA AAATCAATGC 13740 

CCAATTCTTC TGAATGAATT GGGATTTTAC CAATAGCCTG AACTGCATCA ACATGATAGG 13800 

CAGCAGGGTG TTGCTTGAGT ATTTGGCCAA TTTCAGCGAT GGGCAGTAGG TTTCCTGTCT 13860 

CATTATTGAC AAACATGGTA GAAACCAAAA TCGTATCGTC ACGTAAAGCC TTTTGAATTT 13920 

GCTGGGCTGT GATTTCTTGA TTTTCTGGCT GGATAATGGT TGCTTCAAAC CCAAAGTGTT 13980 

GAACCAAGTA ATCAATTGTT TCAAGGACAG CATGGTGCTC GATGGCAGTT GTGATGATAT 14040

GTTTTCCTTG TTCTTGGTGA CGAAGACAGT AGCCAATGAT GGTAGTATTA TTGCCTTCAG 14100

TCCCACCAGA AGTGAAAAAG ATATGTTGAG GTTTTGTCCT TAGTAACTGG GCTAGTTCCT 14160 

GACGGGCTTC TCGCAAGAGT TTGCCAGCTT GACGACCATG ACCATGAATA CTAGAAGGAT 14220 

TTCCGTGGGT TTCTTGCATA ACCTTGGTCA TAGCTGAAAT AGCAACTGCT GACATAGGAG 14280 

TCGTTGCAGC ATTGTCCAAA TAAATCAAAC AATCACCTTA TTTCTTTTTA TTGTAGGCAA 14340

AGAGTGGGCT GACTGGTTTT CTTTCGTGAA TACGGACGAT AGCATCACCA ATTAACTCAC 14400 

TAGCAGTGAT GTAGCATACA TTTTTAGGAG TTTTTTCTTT TGTTGCTACT GAATCAGTCA 14460

CAAGAATTTC TTTAATATTA GTATTGTCAA GAAGCTCAGC AGCTCCCTCG ACGAAGAGAC 14520 

CGTGGCTAGA AACAGCATAA ATTTCTGTAG CTCCTTCACG TTCAACGATT TTACAAGCTT 14580 

CAGAGAAGGT ACGTCCTGTA TTTAAAATAT CATCAATCAA GATAGCTTTC TTACCTTCAA 14640

CATCACCAAT AATATAACCT TCGTTACGAG TTGCATCGTC TTGAGGGTAG TCGATAATGG 14700 

CGATAGGAGC ATCAAGATAT TCAGCCAGGC TACGCGCACG TTTGACACCT GAATTTTTAG 14760 

GGCTAACGAC AACAACATCT GAACCAAGCA ATCCTTTATC GCAGTAATGT TTTGCGAATA 14820

GGGGAACAGT GAAAAGATTA TCCACTGGAA TATCAAAGAA ACCTTGAACC TGAACGGCAT 14880 

GCAAATCAAG AGTCAGGATA CGATCAACTC CAGCCTTAAC CAGCATATTG GCAACTAGTT 14940 

TTGCTGTAAG TGGCTCACGA GGACAAGCAA TGCGGTCTTG ACGTGCATAG CCAAAATATG 15000 

GAAGGACAAC GTTGATACTG TGGGCACTTG CACGCACACA AGCATCGACC ATGATTAACA 15060 

ATTCCATTAG GTGGTTGTTG ACAGGGAAAC TTGTTGATTG GATGATGTAA ACATCATAAC 15120 

CACGGACACT TTCTTCGATA TTTACTTGGA TTTCTCCGTC TGAAAATTGA CGTGATGATA 15180 

GTTTTCCAAG TGGGACACCA ACAGCTTGGG CAATTTTTTG TGCAATCTCT TGGTTAGAGT 15240 

TGAGTGCGAA AAGTTTCATG TTTTTTCTAT CTGACATTAT AGACCGTCCT CTGTAAACTT 15300 

TATAAATCCT AGTTATATTT ACCTTACATA TATGAACTGG GATTTGTGTA TTTTTATCTT 15360 

TTCTATTTTA CCAAAAAATG GAGATTATTT CAGCTATTTT TCATACTTTT GACAAATCCA 15420 

ACCAATTTTG AAGGAGCTTT TTGATAGGAA ATCTGATTTT TCTCTAAAAA TTGTCGAAAA 15480 

TCCTGTTTGC CTTGCTCATG ATTTTCCACT TCAAGCTCCA ATTCGTAATC TGTTATATCA 15540 

AAGTATCGGC TCTGATCCAG TGCCATGAGA CCAATAGCTG TTTTCATTTC ATAGCGAAGC 15600 

GTTGTTAGAC AACCAAGAAC CTGCCAGTTC TTACTTTGGA TACCATGTTT CGCCAATTCA 15660 

TCCAGTACTA GCCCTTGAGG AAGTTCTTCC TTACTCAGAT AGTTCTCAGC ATCTTTTAGT 15720 

TGCAATTTTT GGTTGTATTC CATGTTTCCA ACACTCTGCG GGACTTTGAG TGTCAACTCA 15780

GCCCACTCTT CAAAGCTTCG AATGCGCATA GCGACTTTCT TTrePGOCAC TTCAAAATCA 15840

CGCGTGTCGA TGTAGTAATT TGTTTGAAGA ACAGGAGTGA CACCTGTGAA CTGGTCTTTT 15900 

AGACGATTGT ATTCATCTTT TTTCAATAGT GTTTTCAATT CAATTTCTAA ATGTTTCATT 15960 

TTTCTTACCT TTTTTTATCG TTGAAAGCGG ATTTATGGTA TAATAAGCAT TGTATTTATT 16020 

GTATATGAAT CTGGAGAAAA AATCAAAGAT ATTTTTGACG GATAATATGA GAACAAGGGA 16080 

GAATATATGA CCTTAGAATG GGAAGAATTT CTAGATCCTT ACATTCAAGC TGTTGGTGAG 16140 

TTAAAGATTA AACTTCGTGG TATTCGTAAG CAATATCGTA AGCAAAATAA GCATTCTCCA 16200 

ATTGAGTTTG TGACCGGTCG AGTCAAGCCA ATTGAGAGCA TCAAAGAAAA AATGGCTCGT 16260 

CGTGGCATTA CTTATGCGAC CTTGGAACAC GATTTGCAGG ATATTGCTGG CTTACGTGTG 16320 

ATGGTTCAGT TTGTAGATGA CGTCAAGGAA GTAGTGGATA TTTTGCACAA GCGTCAGGAT 16380 

ATGCGAATCA TACAGGAGCG AGATTACATT ACTCATAGAA AAGCATCAGG CTATCGTTCC 16440

TATCATGTGG TAGTAGAATA TACGGTTGAT ACCATCAATG GAGCTAAGAC TATTTTGGCA 16500 

GAAATTCAAA TTCGTACTTT GCCCATGAAT TTCTGGGCAA CGATAGAACA TTCTCTCAAC 16560 

TACAAGTACC AAGGGGATTT CCCAGATGAG ATTAAGAAGC GACTGGAAAT TACAGCTAGA 16620 

ATCGCCCATC AGTTGGATGA AGAAATGGGT GAAATTCGTG ATGATATCCA ACAAGCCCAG 16680 

GCACTTTTTG ATCCTTTGAG TAGAAAATTA AATGACGGTG TAGGAAACAG TGACGATACA 16740

GATGAAGAAT ACAGGTAAAC GAATTGATCT GATAGCCAAT AGAAAACCGC AGAGTCAAAG 16800 

GGTTTTGTAT GAATTGCGAG ATCGTTTGAA GAGAAATCAG TTTATACTCA ATGATACCAA 16860 

TCCGGATATT GTCATTTCCA TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA 16920 

CGAAAATCAG CTTGACAAGG TCCGCTTTAT CGGTCTTCAT ACTGGACATT TGGGCTTCTA 16980 

TACAGATTAT CGTGATTTTG AGTTGGACAA GCTAGTGACT AATTTGCAGC TAGATACTGC 17040 

GGCAAGGGTT TCTTACCCTG TTCTGAATGT GAAGGTCTTT CTTGAAAATG GTGAAGTTAA 17100 

GATTTTCAGA GCACTCAACG AAGCCAGCAT CCGCAGGTCT GATCGAACCA TGGTGGCAGA 17160 

TATTGTAATA AATGGTGTTC CCTTTGAACG TTTTCGTGGA GACGGGCTAA CAGTTTCGAC 17220 

ACCGACTGCT AGTACTGCCT ATAACAAGTC TCTTGGCGGT GCTGTTTTAC ACCCTACCAT 17280 

TGAAGCTTTG CAATTAACGG AAATTGCCAG CCTTAATAAT CGTGTCTATC GAACACTGGG 17340 

CTCTTCCATT ATTGTGCCTA AGAAGGATAA GATTGAACTT ATTCCAACAA GAAACGATTA 17400 

TCATACTATT TCGGTTGACA ATAGCGTTTA TTCTTTCCGT AATATTGAGC GTATTGAGTA 17460

TCAAATCGAC CATCATAAGA TTCACTTTGT CGCGACTCCT AGCCATACCA GTTTCTGGAA 17520 

CCGTGTTAAG GACGCCTTTA TCGGCGAGGT GGATGAATGA GGTTTGAATT TATCGCAGAT 17580

GAACATGTCA AGGTTAAGAC CTTCTTAAAA AAGCACCAGG TTTCTAAGGG ATTGCTGGCC 17640

AAGATTAAGT TTCGAGGTGG AGCTATTCTG GTCAATAATC AACCGCAAAA TGCAACGTAT 17700

CTATTGGACG TTGGAGACTA CGTTACCATT GACATTCCCG CTGAGAAAGG CTTTGAAACC 17760

TTGGAGGCTA TTGAGCTTCC ATTAGATATT CTCTATGAGG ATGACCACTT TCTAGTCTTG 17820

AATAAACCCT ATGGAGTGGC TTCTATTCCT AGTGTCAATC ACTCTAATAC CATTGCCAAT 17880

TTTATCAAGG GTTACTATGT CAAGCAAAAT TATGAAAATC AGCAGGTTCA CATTGTTACC 17940

AGACTAGATA GGGATACTTC TGGCTTGATG CTCTTTGCCA AGCACGGTTA TGCCCATGCA 18000

CGATTAGACA AGCAGTTGCA GAAGAAATCT ATCGAGAAAC GCTACTTTCC TTTGGTTAAG 18060

GGAGATGGAC ATTTGGAGCC AGAAGGGGAA ATTATTGCTC CGATTGCGCG TGATGAAGAT 18120

TCCATTATTA CCAGACGAGT GGCTAAAGGC GGAAAGTATG CCCATACTTC ATACAAGATT 18180

GTAGCTTCTT ATGGAAATAT TCACTTGGTC TATATTCACC TGCACACTGG TCGAACCCAT 18240

CAAATCCCAG TCCATTTTTC TCATATCGGT TTTCCTTTGC TGGGAGATGA TTTCTATGGT 18300

GGTAGTCTCG AAGATGGTAT TCAACCTCAG GCTCTGCATT GCCATTACCT ATCCTTTTAT 18360

CATCCATTTT TAGAGCAAGA CTTGCAGTTA GAAACTCCCT TGCCGGATGA TTTTAGTAAC 18420

CTTATTACCC AGTTATCAAC TAATACTCTA TAAAAACTGT CTCAGAGTAT AATTATTATC 18480

TTAAAGGAGA AAACTCATGG AAGTTTTTGA AAGTCTCAAA GCCAACCTTG TTGGTAAAAA 18540

TGCTCGTATC GTTCTCCCTG AAGCGGAAGA GCCTCGTATT CTTCAAGCAA CAAAACGCTT 18600

AGTAAAAGAA ACAGAAGTGA TTCCTGTTTT GCTTCGAAAT CCTGAAAAAA TTAAAATTTA 18660

TCTTGAAATT GAAGGAATCA TGGATGGTTA TGAGGTCATC GACCCTCAAC ATTATCCTCA 18720

ATTTGAAGAA ATGCTTTCTG CCTTGGTGGA GCGTCGCAAG GGCAAAATGA CTGAAGAACA 18780

TGTACGCAAG GTTTTGGTTG AACATGTCAA CTACTTTGGT GTGATGTTGG TTTACTTGGG 18840

CTTGGTTGAT GGAATGGTCT CAGGAGCGAT TCACTCAACA GCTTCAACAG TTCGCCCAGC 18900

TCTACAAATC ATCAAAACTC GTCCAAATGT AACTCGTACT TCAGGAGCCT TCCTCATGGT 18960

TCGTGGTACG GAACGTTACC TATTTGGAGA CTGTGCCATT AACATCAATC CAGATGCAGA 19020

AGCCTTGGCT GAAATTGCCA TCAACTCAGC AATCACACCT AAGATGTTTG GCATCGAACC 19080

TAAAATTGCC ATGTTGAGCT ATTCTACTAA AGGTTCAGGG TTTGGTGAAA GCGTTGATAA 19140

GGTCGTTGAA GCAACTAAAA TTGCTCACGA CTTGCGTCCT GACCTTGAAA TCGATGGTGA 19200

GTTGCAATTT GATGCAGCCT TTGTTCCTGA AACTGCAGCT CTGAAAGCTC CTGGAAGTAC 19260

GGTAGCTGCT CAAGCAAATG TCTTCATCTT CCCAGGTATC GAGGCAGGAA ATATTGGTTA 19320

CAAGATCGCT GAACGCCTGG GTCGCTTTCC GGCTGTAGGA CCTGTTTTGC AAGGTTTAAA 19380

CAAGCCAGTT AATGATCTTT CTCGTGGATG TAATGCAGAT GATGTTTACA AGTTGACCCT 19440 

CATCACAGCA GCTCAAGCAG TTCATCAATA GTGAAAACTA TAAAGTGATA TACTATGCTA 19500 

TACTGTAGTT ATGAAACTAT GTACGAAAAG CACTGCCATT AATTCCTGAG AACTAAATTA 19560 

CTGATTGGTG TCAAAAAGGA AAACTTCCAA GCGATGATAT CCTGTCTATA CACGACCTAT 19620 

AGAAATCTGT AATATACATA TCCGTAAAAC GATAAATTCC CTTTTTGATT TTAAATGAGT 19680 

ATGAAAAGAG AATTTTTTGG CTCTTTGTCA ACTGTAGTGG GTTGAAGAAA AGCTAAGCTC 19740 

GAGAAAGGAC AAATTTCATC CTTTCTTTTT TGATATTCAG AGCGATAAAA ATCCGTTTTT 19800 

TGAAGTTTTC AAAGTTCCGA AAACCAAAGG CATTGCGCTT GATAAGTTTG ATGAGATTAT 19860 

TGGTCGCTTC CAGTTTGGCG TTAGAATAGT GTAGTTGAAG GGCGTTGATA ATCTTTTCTT 19920 

TATCTTTGAG GAAGGTTTTA AAGACAGTCT GAAAAATAGG ATGAACCTGC TTAAGATTGT 19980 

CCTCAATAAG TCCGAAAAAT TTCTCTGGTT CCTTATTCTG GAACTGAAAA AGCAAGAGTT 20040 

GATAGAGCTG ATAGTGGTGT TTCAAGTCTT CCGAATAGCT CAAAAGCTTG TTTAAAATCT 20100 

CTTTATTGGT TAAGTGCATA CGAAAAATAG GACGATAAAA TCGCTTATCA CTCAGTTTAC 20160 

GGCTATCCTG TTGAATGAGT TTCCAGTAGC GCTTGATAG 20199 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19702 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ACCCGATGTA TCAGCGGATA TTTACTCTAT TTTTCAAACG ATGTTATACC CACAATAAAA 60 

GAAAAAAGAC CCTAAGGTCT CCTTTGCTTT TATTATTAAA CGCGTTCAAC TTTACCTG/YT 120

TTCAAAGCAC GAGCTGAAGC CCAAACTTTT TTAGGTTTAC CATCGATAAG AACAGTAACT 180 

TTTTGAAGGT TTGGTTTTAC GGCACGTTTT GTTTGGTTCA TCCCGTGTGA ACGGTTGTTT 240

CCTGATACAG TCTTACGACC TGTAAAGTAA CATACTTTAG CCATTGTGTT TTCCTCCTAT 300 

TAGATCTAAT ATAGCGGATG TGCTAGCACC ACATACCGTA CTATGTTATC ACATTTTCTT 360 

GTTTTTTGCA AGGGAATTGG AAGATTTTTT ATTTGTGTCT TAAATCAGGT CTTGCGTGAC 420 

ATTTcTGCTC TCCACATGCC ATCGTTGATT AACAGAACAC CAGAATTAAA ATTATGTGTA 480 

TAAAAATCAT CTCTAACTGC AGCTAAGGGT ATAGCCCTCA AGTCCAAATC CCACAGCTCA 540

TCTATCGATT TTCTTACAAC AATATCTGAA TCCAAATACA GTACACGAGA CTCGCTTACA 600

TACTTTGGAA TAAAATACCT AAAAAAGCCG CATATGAAAG TCCCTCAAAG GGGAGACGAT 660

AACCTTTCAG AATATTACTG TCAATCTAAA CATTCACAAT CTCACTATTC AAAGTCTCTA 720

GTCTTTTTTC CATCAATTGG AACCATTCTC GCGGAAGGTC ATCATTAAAA ACATAAAACT 780

TAAGATTATA ATGATGAACA CAAAGAGATT TTATTGTTGT TTCAACTTTA TCCATATAAG 840

CATTATCTGC ACCTAAGACA ATCGCTTTTT TCTCTTCTTT CACTTTTTAT CTCATTTCTT 900

TTTATTCCCA TCATATTATT CCCATCATAT GTTTCCCATC ATATGTTTCT ACGTAACCAT 960

TATTTTCGCC TATTCGTTCG TAAAACCATA CCAGTGGAGA TTTTAGATGA AGTCCCATTA 1020

CGGTTTACAA TTTTTACATT ACGACACGGA GTTTTACAAA TCGATTTCAT TTGCCAAACG 1080

TAGTTAGTGA GGCAGTTAGC TAGTTCGCCA AATAGCGACT AGCGTCCAAC AATTTGGAAC 1140

TTTAGTTCCA ATTGTTGGTA CTGAGTCACA TCTTCTCCTC TAACTCTACG TCTGGATACT 1200

TGTCCGCAAA CCAGCGGAGG GCAAAGTCAT TTTCAAAGAG AAAGACTGGT TGGTCAAAAC 1260

GGTCTTTGGC TAAGATATTG CGACTTGACG ACATCCGTTC ATCCAAGTCC TCAGGCTTGA 1320

TCCAACGAAC GGTCTTTTTA CCCATTGGGT TCATAACTAC TTCCGCATTG TACTCGCCTT 1380

CCATGCGGTG TTTAAAGACT TCAAACTGGA GTTGACCTAC AGCGCCTAGC ATGTACTCAC 1440

CTGTTTGGTA ATTCTTATAA AGCTGAACGG CTCCTTCTTG CACCAATTCC TCAATCCCCT 1500

TGTCGAAGCA TTTTTGCTTC ATAACATTCT TAGCAGAAAC TTTCATGAAA ATCTCAGGTG 1560

TAAAGGTTGG CAGGGGTTCA AATTCAAACT TGTTTTTTCC AACCGTCAAG GTATCCCCAA 1620

CCTGATAAGT ACCGGTATCG TAAACCCCGA TAATATCACC TGCCACCGCA TTGGTCACAT 1680

TCTCACGACT CTCCGCCATA AACTGGGT W CATTAGATAG TTTAGCC'TC T7ACCAGTAC 1740

GACGGAGATT GACACTCATG CCGCGCTCAA ATTCGCCAGA TACGATACGG ACAAAGGCAA 1800

TACGGTCACG GTGACGAGGG TCCATGTTGG CTTGGATTTT AAAGACAAAG CCTGAGAAAT 1860

CCTTGTCATA AGGATCCACA ATTTCACCGT CTGTTTTCTT GTGACCATGT GGTTCTGGAG 1920

CAAACTTGAG GAAGGTTTCA AGGAAGGTCT GCACACCAAA GTTTGTCAGG GCTGAACCGA 1980

AAAAGACAGG CGTCAATTCT CCAGCCAGAA TAGCTTCCTC TGAAAACTCA TTCCCGGCTT 2040

CATTTAAAAG CTCAATGTCA TCCTTGACTT GCTCGTAGAA AGGATTGCTA CCAAAGAGTT 2100

TGTCCCCGTC TTCTAGACTG GCAAAACGCT CATCCCCTTT GTAAAGCTCT AAACGTTGGT 2160

TATAGAGGTC ATACAAGCCC TCAAAGGCTT TCCCCATCCC GATAGGCCAG TTCATAGGGT 2220

AGCTAGCAAT GCCCAAGATT TCTTCCAATT CTTGCAAGAG ATCCAAAGGC TCACGACCGT 2280

CACGGTCCAG CTTGTTCATA AAGGTAAAGA CTGGAATGCC ACGATGTTTC ACAACCTCAA 2340

ACAATTTCTT GGTTTGAGCC TCGATCCCCT TGGCAGAGTC CACGACCATG ACCGCAGCAT 2400 

CCACCGCCAT CAAGGTACGA TAGGTATCTT CTGAGAAGTC CTCGTGCCCT GGCGTGTCTA 2460

AGATATTCAC GCGCTTGCCG TCGTAGTCAA ATTGCATAAC AGATGAAGTA ACAGAAATCC 2520 

CACGTTGCTT CTCGATATCC ATCCAGTCAG ATTTAGCAAA AGTCCCTGTT TTCTTCCCTT 2580 

TTACCGTACC AGCCTCACGA ATCTCACCCC CAAAGTAGAG TAACTGCTCA GTGATGGTTG 2640 

TTTTCCCCGC GTCCGGGTGG GAGATAATGG CAAAGGTACG ACGTTTCTTA ATTTCTTCTT 2700 

GAATATTCAT AAGTTCTCTT TCTTTGATTC TCTATTTTTC TTGTTTCAAT AGCTGAGAAT 2760 

CATTTTTACA TTGGATTTTA CCATTCCTTT CAACACTCCA TTATATCGCA TTTTAGCATT 2820 

TTTTTCAATT TCTATTTCTT TTCACTTCCC CCTCCCTTAT TTATAGGAAA ATATGGTAAA 2880 

ATAGAACAGA CTAAAAATCA TCATTTCACG AAAGGATGCA AGATGAAAAT TACGCAAGAA 2940 

GAGGTAACAC ACGTTGCCAA TCTTTCAAAA TTAAGATTCT CTGAAGAAGA AACTGCTGCC 3000 

TTTGCGACCA CCTTGTCTAA GATTGTTGAC ATGCTTGAAT TGCTGGGCGA AGTTGACACA 3060

ACTGGTGTCG CACCTACTAC GACTATGGCT GACCGCAAGA CTGTACTCCG CCCTGATGTG 3120 

GCCGAAGAAG GAATAGACCG TGATCGCTTG TTTAAAAACG TACCTGAAAA ACACAACTAC 3180 

TATATCAAGG TGCCAGCTAT CCTAGACAAT GGAGGAGATG CCTAATGACT TTTAACAATA 3240 

AAACTATTGA AGAGTTGCAC AATCTCCTTG TCTCTAAGGA AATTTCTGCA ACAGAATTGA 3300

CCCAAGCAAC ACTTGAAAAT ATCAAGTCTC CTGAGGAAGC CCTCAATTCA TTTGTCACCA 3360

TCGCTGAGGA GCAAGCTCTT GTTCAAGCTA AAGCCATTGA TGAAGCTGGA ATTGATGCTG 3420 

ACAATGTCCT TTCAGGAATT CCACTTGCTG TTAAGGATAA CATCTCTACA GACGGTATTC 3480 

TCACAACTGC TGCCTCAAAA ATGCTCTACA ACTATGAGCC AATCTTTGAT GCGACAGCTC 3540

TTGCCAATGC AAAAACCAAG GGCATGATTG TCGTTGGAAA GACCAACATG GACGAATTTG 3600 

CTATGGGTGG TTCAGGTGAA ACTTCACACT ACGGAGCAAC TAAAAACGCT TGGAACCACA 3660

GCAAGGTTCC TGGTGGGTCA TCAAGTGGTT CTGCCGCAGC TGTAGCCTCA GGACAAGTrC 3720 

GCTTGTCACT TGGTTCTGAT ACTGGTGGTT CCATCCGCCA ACCTGCTGCC TTCAACGGAA 3780 

TCGTTGGTCT CAAACCAACC TACGGAACAG TTTCACGTTT CGGTCTCATT GCCTTTGGTA 3840 

GCTCATTAGA CCAGATTGGA CCTTTTGCTC CTACTGTTAA GGAAAATGCC CTCTTGCTCA 3900 

ACGCTATTGC CAGCGAAGAT GCTAAAGACT CTACTTCTGC TCCTGTCCGC ATCGCCGACT 3960 

TTACTTCAAA AATCGGCCAA GACATCAAGG GTATGAAAAT CGCTTTGCCT AAGGAATACC 4020 

TAGGCGAAGG AATTGATCCA GAGGTTAAGG AAACAATCTT AAACGCGCCC AAACACTTTC 4080

AAAAATTGGG TGCTATCGTC GAAGAAGTCA GCCTTCCTCA CTCTAAATAC GGTGTTGCCG 4140

TTTATTACAT CATCGCTTCA TCAGAAGCTT CATCAAACTT GCAACCCTTC GACGGTATCC 4200

GTTACGGCTA TCGCGCAGAA GATGCAACCA ACCTTGATGA AATCTATGTA AACAGCCGAA 4260

GCCAAGGTTT TGGTGAAGAG GTAAAACGTC GTATCATGCT GGGTACTTTC AGTCTTTCAT 4320

CAGGTTACTA TGATCCCTAC TACAAAAAGG CTGGTCAAGT CCGTACCCTC ATCATTCAAG 4380

ATTTCGAAAA AGTCTTCGCG GATTACGATT TGATTTTGGG TCCAACTGCT CCAAGTGTTG 4440

CCTATGACTT GGATTCTCTC AACCATGACC CAGTTGCCAT GTACTTAGCC GACCTATTGA 4500

CCATACCTGT AAACTTGGCA GGACTGCCTG GAATTTCGAT TCCTGCTGGA TTCTCTCAAG 4560

GTCTACCTGT CGGACTCCAA TTGATTGGTC CCAAGTACTC TGAGGAAACC ATTTACCAAG 4620

CTGCTGCTGC TTTTGAAGCA ACAACAGACT ACCACAAACA ACAACCCGTG ATTTTTGGAG 4680

GTGACAACTA ATGAACTTTG AAACAGTCAT CCGACTTGAA GTCCACGTAG AGCTCAACAC 4740

CAATTCAAAA ATCTTCTCAC CTACTTCTGC CCACTTTGGA AATGACCAAA ATGCCAACAC 4800

TAACGTGATT GACTGGTCTT TCCCAGGAGT TCTACCAGTT CTCAATAAAG GGGTTGTTGA 4860

TGCCGGTATC AAGGCTGCTC TTGCCCTCAA CATGGACATC CACAAAAAGA TGCACTTTGA 4920

CCGCAAGAAC TACTTCTATC CTGATAACCC CAAAGCCTAC CAAATTTCTC AGTTTGATGA 4980

ACCAATCGGA TATAATGGCT GGATTGAAGT CAAACTAGAA GACGGTACGA CCAAGAAAAT 5040

CGGTATCGAA CGTGCCCACC TAGAGGAAGA CGCTGGTAAA AACACCCATG GTACAGATGG 5100

CTACTCTTAT GTTGACCTCA ACCGCCAAGG GCTTCCCTTG ATTGAGATTC TATCTGAGGC 5160

AGATATGCGT TCTCCTGAAG AAGCCTATGC TTATCTGACA GCCCTCAAGC AAGTTATCCA 5220

GTACGCTGGC ATTTCTGACG TTAAGATGGA GGAAGGTTCG ATGCGTGTGG ATGCCAACAT 5280

CTCCCTTCGT CCTTATGGTC AAGAGAAATT CGGTACCAAG ACTGAATTGA AGAACCTCAA 5340

CTCCTTCTCA AACGTTCGTA AAGGTCTTGA ATACGAAGTC CAACGCCAGG CTGAAATTCT 5400

TCGCTCAGGT GGTCAAATCC GCCAAGAAAC ACGCCGTTAC GATGAAGCGA ATAAAGCAAC 5460

CATCCTCATG CGTGTCAAGG AAGGGGCTGC TGACTACCGC TACTTCCCAG AACCAGACCT 5520

ACCCCTCTTT GAAATTTCTG ACGAGTGGAT TGAGGAAATG CGGACTGAGT TGCCAGAGTT 5580

TCCAAAAGAA CGTCGTGCGC GTTATGTATC TGACCTTGGT TTATCAGACT ACGATGCTAG 5640

TCAGTTGACT GCTAATAAAG TCACTTCTGA CTTCTTTGAA AAAGCTGTTG CCCTAGGTGG 5700

TGATGCCAAA CAAGTCTCTA ACTGGCTCCA AGGGGAAGTC GCTCAGTTCT TGAATGCTGA 5760

AGGTAAAACA CTGGAACAAA TCGAATTGAC ACCAGAAAAC TTGGTTGAAA TGATTGCCAT 5820

CATCGAAGAC GGTACTATTT CATCTAAGAT TGCCAAGAAA GTCTTTGTCC ATCTAGCTAA 5880

AAATGGCGGT GGCGCGCGTG AATACGTGGA AAAAGCAGGT ATGGTTCAAA TTTCAGATCC 5940

AGCTATCTTG ATCCCAATCA TCCACCAAGT CTTTGCCGAT AACGAAGCTG CTGTTGCCGA 6000

CTTCAAGTCA GGCAAACGTA ACGCCGACAA GGCtTTACAG GATTCCTTAT GAAGGCAACC 6060

AAAGGCCAAG CCAACCCACA AGTTGCCCTT AAACTACTTG CACAGGAATT GGCGAAGTTG 6120

AAAGAAAACT AGACAGAACA AAACCAGCCC TAAGGTTGGT TTTTTCTTCT CTACCAACTC 6180

CCAATAACTA TTTTGGCTTT ATTTCCAGAG TATTTTATGG TAAAATGAAG AGTAATAATA 6240

TTTATTAAAG AGGTAAAAAC ATGATTGAAG CAAGTACCTT AAAAGCTGGT ATGACCTTTG 6300

AAACAGCTGA CGGCAAATTG ATTCGCGTTT TGGAAGCTAG TCACCACAAA CCAGGTAAAG 6360

GAAACACGAT CATGCGTATG AAATTGCGTG ATGTCCGTAC TGGTTCTACA TTTGACACAA 6420

GCTACCGTCC AGAGGAAAAA TTTGAACAAG CTATTATCGA GACTGTCCCA GCTCAATACT 6480

TGTACAAAAT GGATGACACA GCATACTTCA TGAATACAGA AACTTATGAC CAATACGAAA 6540

TCCCTGTAGT CAATGTTGAA AACGAATTGC TTTACATCCT TGAAAACTCT GATGTGAAAA 6600

TCCAATTCTA CGGAACTGAA GTGATCGGTG TCACCGTTCC TACTACTGTT GAGTTGACAG 6660

TTGCTGAAAC TCAACCATCT ATCAAAGGTG CTACTGTTAC AGGTTCTGGT AAACCAGCAA 6720

CGATGGAAAC TGGACTTGTC GTAAACGTTC CAGACTTCAT CGAAGCAGGA CAAAAACTCG 6780

TTATCAACAC TGCAGAAGGA ACTTACGTTT CTCGTGCCTA ATCTCTAGAA AGAGGTCATT 6840

CTATGGGAAT TGAAGAACAA CTTGGCGAAA TCGTTATCGC CCCACGTGTA CTTGAAAAAA 6900

TCATTGCTAT CGCTACTGCA AAGGTAGAGG GTGTTCACTC TTTTTCAAAC AGATCAGTGT 6960

CTGATACCCT TTCAAAACTT TCACTCGGCC GTGGCATTTA TCTTAAAAAC GTGGACGAAG 7020

AACTCACAGC AGATATCTAT CTCTACCTTG AGTACGGAGT AAAAGTTCCT AAGGTAGCGG 7080

TTGCTATCCA GAAAGCTGTC AAAGATGCCG TCCGTAATAT GGCTGATGTA GAACTCGCTG 7140

CTATCAATAT TCACGTTGCA GGTATCGTCC CAGATAAAAC ACCAAAACCA GAATTGAAAG 7200

ATCTATTTGA CGAGGACTTC CTCAATGACT AGTCCACTAT TAGAATCTAG ACGCCAACTC 7260

CGTAAATGCG CTTTTCAAGC TCTCATGAGC CTTGAGTTCG GTACGGATGT CGAAACTGCT 7320

TGTCGTTTCG CCTATACTCA TGATCGTGAA GATACGGATG TACAACTTCC AGCCTTTTTG 7380

ATAGACCTCG TTTCTGGTGT TCAAGCTAAA AAGGAAGAAC TAGATAAGCA AATCACTCAG 7440

CATTTAAAAG CAGGTTGGAC CATTGAACGC TTAACGCTCG TGGAGAGAAA CCTCCTTCGC 7500

TTGGGAGTCT TTGAAATCAC TTCATTTGAC ACTCCTCAGC TGGTTGCTGT TAATGAAGCT 7560

ATCGAGCTTG CAAAGGACTT CTCCGATCAA AAATCTGCCC GTTTTATCAA TGGACTGCTC 7620



AGCCAGTTTG TAACAGAAGA ACAATAAGGC TCTTTGTCAA CTGTAGTGGG TTGAAAAAAA 7680

GCTAAGCTCG AGAAAGGACA AATTTCGTCC TTTCTTTTTT GATGTTCAAA GCGATAAAAA 7740

TCCGTTTTTT GAAGTTTTCA AAGTTTCGAA AACCAAAGGC ATTGCGCTTG ATAAGTTTGA 7800

TGAGATTATT GGTCGCTTCC AGTTTGGCAT TAGAATAGTG TAGTTGAAGG GCGTTGACAA 7860

TCTTTTCTTT ATCTTTGAGG AAGGTTTTAA AGACAGTCTG AAAAATAGGA TGAGCCTGCT 7920

TAAGATTGTC CTCAATAAGT CCGAAAAATT TCTCTGGTTC CTTATTCTGG AAGTGAAACA 7980

GCAAGAGCTG ATAGAGCTGA TAGTGGTGTT TCAAGTCTTG TGAATGGC7C AAAAGCTTGT 8040

CTAAAATCTC TTTATTGGTT AAGTGCATAC GAAAAGTAGG ACGATAAAAT CGCTTATCAC 8100

TCAGTCTACG GCTATCCTGT TGAATGAGTT TCCAGTAGCG CTTGATATCC TTGTATTCAT 8160

GGGATTTTCG ATGAAACTGA TTCATGATTT GGACACGCAC ACGACTCATG CCACGGCTAA 8220

GATGTTGTAC AATGTGAAAG CGATCAAGAA CGATTTTAGC ATTCGGGAGT GAAACAGTCT 8280

GGGAGACTGT TTCAGCCTGA GCCTAGGAAT TTGAAAGCGA AGCTGTTTAG CCAAGTCATA 8340

CTAAGGGCTA AACATATCCA TAGTAATAAT TTTGACGCGA CATCGGACAA CTCTATCGTA 8400

GCGAAGAAAG TGATTTCCAA TGATAGCTTG TGTTCTACCC TCAAGAACAG TGATGATATT 8460

GAGATTGTTA AAATCTTGCG CAATGAAGCT CATCTTTCCC TTTGTAAAAG CATACTCATC 8520

CCAAGACATA ATCTCAGGAA GACAAGAAAA ATCATGTTTA AAGTGAAAAT CATTGACCTT 8580

ACGAATAACA GTTGAAGTTG AGATGGAAAG CTGATGGGCA ATATCAGTCA TAGAAATCTT 8640

TTCAATCAAC TTTTGAGCAA TCTTTTGGTT GATGATACGA GGGATTTCGT GATTTTTCTT 8700

GACGATAGAA GTTTCAGCGA CCATCATTTT TGAACAGTGA TAGCACTTGA ATCGACGCTT 8760

TCTAAGGAGA ATTCTAGTAG GCATACCAGT CCTTTCAAGA TAAGGAATTT TAGAAGCTTT 8820

TTGAAAGTCA TATTTCTTCA ATTGGTTTCC GCACTCAGGG CAAGATGGGG CGTCGTAGTC 8880

CAGTTTGGCG ATGATTTCCT TGTGTGTATC CTTATTGATG ATGTCTAAAA TCTGGATATT 8940

AGGGTCTTTA ATGTCTAGTA ATTTTGTGAT AAAATGTAAT TGTTCCATAT GAATCTTTCT 9000

AATGAGTTGT TTTGTCGCTT TTCATTATAG GTCATATGGG ACTTTTTTTC TACAATAAAA 9060

TAGGCTCCAT AATATCTATA GGGGATTTAC CCACTACAAA TATTATAGAG CCAACAATAA 9120

AAAGAAAAAG TGTTTGATAG ATATCAAACA CTTTTTTCTT TGCCTCCCAC TATCTAAAAA 9180

AATGATAATA GATATAATTG TAAACAAAAA TCCAGATAGG TTTTGCATGA TTGAGAAAGT 9240

TAAAAAAACT ATGGCAGAGA ATCGTTAATC TCAGATTGTC GGTAGAACGA TAAACAAGGG 9300

CAAAAAAGAA ACCAATCAGA CTATAATATA ATAAACTAAT TGGATCTCTG TGAGATACTA 9360

TCAAATGGCT AATCCCAAAG ATGATAGCAG ATAGGATAAC ATCCAAATAG TACTTGGACT 9420

AGGGAAAGAA GGTATTCATA AAATACCCTC TATCAAGAGT CTCCTCAAAA ACAGGACCGA 9480

TGATTACAGG CAGGACAAAA GATAAGATAG TCGATAAAAA GGTTGGTTGT CCATTTGAAA 9540

AAAGCACGGT AAAATACTCA TCATGAATAT TCCTATGATT AATCAAATGA GCATAGCGTG 9600

CCCAAAAATT ACCGAGAATC TGATAAACCA CATAAGTTGC AASTAAGTAG AAGACAAATG 9660

ACCAGTTCCA GCTCTTTTTC TCAAAGATAA AGACCATCTT TTTCTTTTTT AACCTCCAAA 9720

TTAATAGAAG GAAACTTCCC ACTAATCCCA TTGTTAAAAT AAGAGAATAG ACATCAGCTC 9780

CTAACCCTAA AATGATCGTC ACATACAATC CAATTGTTTG TGGTAAATAG GTAGATAGTA 9840

AAATAATAAG CAAAAATATT CCAAATTGTC TTAGTTTTTT TGTGTTTCTC ATCGTACTTT 9900

TTTGAAAGAT TACCCTGCTC GGAAGCCGTA CTTCCAAGCA TCTATATAAG AATTAAGTGC 9960

CCCTTGCCTC ATATAGGGAG CAAATTCTCT ATAATATAAC CATCTACTAT ATCCATCTTC 10020

CCAAACAGCA AGACCACCTG AAGTTTGCTC CAAGTCCTCA GTTGAAAGAA CTGTAAATGT 10080

ATTTGTACCT GTCATTGCAA GTACCTTCTT AAAATAGATT GTTGTAGGCT CACATTTATA 10140

GTATATTTCT TTTTTTGTCT ATTTTATAGC CCATCTCCTC AACTGGCAAT TTTTCGACCT 10200

GAATTACATT TTTCCATAAA AAATGAGACC TTTCTAGTCT CATTTAGTCA TTCTTAGTAT 10260

TTTCTAAATC GTTGATAGCG TTCTTCCAGC AACTCTTCTA GCGGTTTTTG TGAAAGTCTA 10320

GCCAGCTCCG TTTGGAGTTC TTTTTTGACA CTCTTAATCA GTTCTTTACT AGAAAGTCCT 10380

ATTTCAGAAA TCACCTTATC CACCACGTCC ATTTCTAACA GTTCATGCGA AGTCATTTTC 10440

ATCAGTTCTG CTGCTTCCAT AGCGCGAGTA CCCTCCTTCC ATAAAATGGA AGCAAAGCCT 10500

TCTGGACTGA GAATGGCATA GATAGAATTT TCCAGCATCC AGACACGGTC CGCGACAGCT 10560

AGAGCCAGAG CCCCGCCTGA ACCACCTTCA CCGATAATAA TGGCGATAAT AGGAACTTTC 10620

AGGTCACTCA TTTCCATGAG ATTGCGAGCG ATAGCTTCCC CTTGACCACG TTCTTCCGCT 10680


CCGACACCAG 






AT AAAnfZTrVA 




GCCAAATTTC 


10740 


TCAGCCTGTT TCATCAACCG CAGTGCCTTT CGGTAGCCTT CTGGATGTGG TTGGCCAAAA 10800

TTCCCTTTGA GGTTGTCTTG CAAACTCTTG CCTTTTTGGA TACCAACCAC TGTTACAGCT 10860

TGGTCTCCAA GCCAACCAAT ACCACCAACA ACTGCACCAT CATCACGAAA AGAACGGTCA 10920

CCATGTAATT GGATAAATTC ATCAAAAATG CCTGTCGCAA AGTCCAAGGT TGTCAAGCGA 10980

CTCTGCTCAC GCGCTTCTCT GACTATTTTT GCAATATTCA TCTAGGACTC CCTCCATGCA 11040

ATCTGACTAG GCTAGCAATC GTATCTGGTA AGTCTCTTCT TTTGACAATA GCATCCACAA 11100

AGCCATGTTC TAATAGGAAT TCTGCCTTTT GGAAATCCTC AGGCAAGCTT TCACGAACCG 11160

TATTTTCAAT CACACGACGC CCAGCAAAAC CAACCAAGCT CTGTGGTTCA GCCAGAATGA 11220

TATCGCCTTC CATACCGAAA GAAGCTGTCA CACCACCACT CGTTGGATCT GTCAAAATGG 11280 

TCAGGTAAAA GAGACCAGCA TTTGAATGGC GTTTAACCGC CGCAGAGATC TTAGCCATCT 11340 

GCATGAGACT CATGATTCCT TCCTGCATAC GGGCTCCACC AGAGGCTGTG AATAGGACAA 11400

CTGGCAATTT TTCGACAGTC GCATACTCAA ACAAACGAGT GATTTTTTCA CCTACAACCG 11460 

TACCCATAGA AGCCATGATA AAGTTAGAAT CCATAATCCC AAGAGCCACA GTCTGACCTT 11520 

TAATAAGAGC AGTTCCTGTC ACAACGGCTT CATGCAGACC TGTTTTTTCA CGCATAGATG 11580 

CCAGTTTCTT TTGGTAACCA GGGAAATGCA AGGGATCCTT GCTTTCAATC CCTGTAAACA 11640 

ATTCTTTGAA GGTTCCCATA TCAATCGTCA AAGCCAAGCG TTCTTGGGCA GAAATACGAA 11700 

AGGTATAGCT ACAGTGCGGA CAGATACGTT CACTTCCCAG ATCCTTCTGA TAGATGGTAT 11760 

GCTTACAGCC TGGACACTGG GAAAATAATT CATCTGGAAC CTCTGGCTTA GCTTGACGTT 11820 

TTTCCCTAAC CGAACGATTG GGATTGATTC GAATATACTT ATCTTTTTTA CTAAATAGAG 11880

CCATTGATTC CCCTTTTCGG TTTAAACTCT TAAAGTCATT TTATTCTTTT TCTTGATATT 11940 

TAGGTAAGAA GGTTTCCATC AAGAAGGAAG TATCATAATC CCCAGCAATG ACATTGCGAT 12000 

CTGAAATGAG GTCAAGCTGG AAATCTGCAT TGGTCTGCAC TCCTTCAATT TCTAATTCAT 12060 

AGAGGGCACG TTGCATTTTC ATCAAGGCGT CAAAACGATT TTCGCCGTGT ACTATGATTT 12120 

TGGCAATCAT ACTATCATAA TAAGGCGGAA TGGTATAACC TGGATAAACT GCTGAATCCA 12180 

CGCGCAAGCC AACTCCACCA CTTGGCAGAT AGAGATTAGT AATCTTACCT GGACTTGGAG 12240

CAAAGTTAAA GGCTGGGTTT TCTGCATTGA TACGACACTC GATGGCATGA CCGCGTAGGA 12300 

CAATATCTTC TTGCTTAACA GACAAAGGCT GACCTGCCGC AATGCAAATC TGTTCCTTAA 12360 

CGATATCAAC ACCTGAAACA AACTCTGTTA CTGGATGTTC TACCTGAACA CGAGTATTCA 12420 

TCTCCATGAA ATAGAAATTG CTACTTGCTT CATCAAGAAG AAATTCAATG GTTCCTGCAT 12480 

TCTCATAGCC AACAAACTCT GCCGCTCGAA CAGCAGCAGC ACCTATTTCA TGACGCAGCC 12540 

TTTTTCCGAT TGCAATCGAG GGACTTTCTT CCAAAACCTT TTGGTTATTC CTTTGAAGAG 12600

AACAATCCCG TTCACCCAAG TGAATCACAT GTCCATGCTC ATCACCTAGG ATTTGAACCT 12660 

CAATGTGCCG AGCTGGATAG ATAACCCGTT CTATGTACAT GGCACCATTG CCATAATTGG 12720 

CCTTGGCCTC ACTAGAGGCA GTTTCAAAGG CAGAAACGAG GTCATCTGGT TTTTCAACCT 12780 

TACGAATCCC TTTACCACCT CCACCTGCTG AAGCCTTGAG CATAACAGGA TAGCCAATTT 12840 

TTTCAGCAAC AATCAAAGCT TCTTCAGAGT TATGCACTTC TCCATCTGAA CCTGGTATAA 12900 
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pah/^papapp 




ATPTGAGCAC 


GCGCATTGAT 


CTTATCCCCC 


ATCATATCCA 


12960 


TA APATP. APP 


AfTATfV; APPC 


ATAAAPTTGA 


TACCTACTTC 


TTCACACATG 


GTCGCAAATT 


13020 


TT!f: A ATTVTP 


AtTflAflAAAT 


CP. AAAACCAG 


GGTGAATAGC 


TTCTGCCTCA 


GTCAAGACTG 


13080 




AAPTHPATTA 


ATATTGAGAT 


AAGACTCTGT 


TGCCTTGCCA 


GGACCAATAC 


13140 


AnAv. i AjL. 1 1a. 


ATPTflPPAAA 


AGCGTATGAA 


GAGCTTCCTT 


ATCAGCAGTT 


GAATAAACCG 


13200 


it ja r~f*f**^rw 
l_ 1 Av_(_v» i UVj\_ 


iVn 1 V.LW»o-r\ 1 


TPAP£?PGCCG 


CACGGATAAT 


ACGAACCGCA 


ATTTCACCAC 


13260 




rri j\ * A AfTTTT 
iAAAAi 1111 


PP.AAAPATGG 


AGAACCTCCT 


TAGTTCCCAA 


TTGCAAAAGT 


13320 


AAGQ»i»TACl_A 


f'TVPP'TYTP A A 




CACTTC AG CC 


TTTGCTTCAA 


CCACAGCTAT 


13380 


GGTGCL.AC_l*A 






CATAAPCAAT 


TGGTCGCCTG 


GTACAACTTG 


13440 


CTTCTTGAAC 


TTAACLT Iaj I 


LLA 1 AL^aw. 


tTTAAAAflAPP 


AGTTTTCCTT 


TATTTTCAGG 


13500 


TTTTuATAAL 






CGCCAAGGCT 


TCCATAATCA 


CAACACCTGG 


13560 


CATAACTGGG 


TA1 IGAUoAA 






TCGTTGATGG 


TCACATTTTT 


13620 


GATAGCAACA 


A I LtU I A I tv. 1 


mere ipttp 


CAAGACACCG 


TCCACTAGAA 


GCATAGGATA 


13680 


ACGGTGGGGA 


AGAGCTTCTT 


1 VjA 1 1 LV. 1 lu 


AATATPfZATP 


ATTTGATACG 


TACCAATCCT 


13740 


TTACCAAACT 


CAACt A ill (_ 




APftAGAATTT 


CCCTTACCAC 


ACCATCCTTA 


13800 


GGAGCTGGGA 


TTTv. A I 1 T_ A 1 


VjAV> 111 LA 1 v* 




TTACCAATGT 


TTGACCTTTT 


13860 


TTGACACTAT 


LACv AAt. lu 1 




RGTTTATCTG 


GTCCAGCAGC 


CAAGTAAACC 


13920 


fk rf^ff*^^ ft ft A It ft 

ACTCCAACAA 




f APA AflATTT 


C PPTC AGT AG 


CCACACTTGC 


TTCAGCTGGA 


13980 


G CTGG AACTT 


I 1 v- 1 *jv- L AL 




GGAGCAGAT G 


TAGGAGCTAC 


TGGACTCGGT 


14040 


GTTGCTAGAA 




APPPAPTTTIA 


CTWflC* AJVCTT 

\J 4 4 i 4 4 


CAGGCACAGG 


TCTTGCTTCA 


14100 


TTCTTGCTAA 


At, 1 l_ 




TTTTTATAAG 

4 4 4 4 4 *» a nnv 


AAAATTCTCT 


CAAACTTGAC 


14160 


TGGTCAAA^rr 


a if*T f* a TP a a 
bAu I LA 1 AA 


PTTTTTA ATA 


4 ^ 4 4 4 A 


TCATACTTAT 


CTATTCTCCC 


14220 


AACGTTTGAA AGCAAGAACT GCATTGTGGC CTCCAAAACC AAAAGTATTT GAAATAGCGT 14280

ATGGAATTTC TTTCTCCAAG CCTTGTCCAT AAACGACATT AGCTTCGATA TAATCTGATA 14340

CTTCACTTGT CCCAGCTGTC ATTGGTACAA AGTTATGACG CATAGCTTCG ATGGTGACGA 14400

TAGCTTCTAC TGCACCCGCA GCCCCCAGCA AATGTCCTGT AAAAGACTTG GTTGATGATA 14460

CAGGTACTTC CTTACCAAGA ACAGCTACGA TAGCACCACT TTCTCCTTTT TCATTGGCAG 14520

GAGTTGACGT TCCGTGAGCA TTGACATAGG CTACTPGCTC TGGAGAAATC TCAGCTTCTT 14580

CCAAGGCTAG TTTGATGCCC TTGATAGCTC CCTGACCTTC TGGATGTGGA GAAGTCATGT 14640

GGTAGGCATC ACAAGTATTT CCGTAACCAA CCACTTCAGC CAGGATAGTA GCTCCACGTT 14700

TTTCAGCGTG TTCAAGACTT TCTAGAACCA ACATCCCTCA ACCTTCACCC ATAACAAACC 14760

CATTGCGATC CTTATCAAAT GGGATCGAAG CACGAGTTGG ATCCTCTGTA GTAGAGAGAG 14820 

CTGTTAAGGC TTGGAAACCA GCGATGGCAA AAGGTGTGAT AGAAGCTTCT GTTCCTCCCA 14880 

CCAACATCAC ATCTTGGAAA CCAAACTTAA TGGAGCGGAA GGCATCCCCA ATCGCATCAT 14940 

TTGATGAAGA GCAGGCAGTA TTGATAGATT TACAAACACC GTTTGCACCA AAACCCATGG 15000 

CTACATTCCC AGAAGCCATA TTTGGTAAAG CTTTTGGAAG AGTCATTGGT TTGACACGTT 15060 

TGGGTCCTTT TTCATGAAGG CGAAGTACCT GATCTTCAAT TTCCTTGATT CCACCAATAC 15120 

CAGATGCAAC GATAACACCA AAACGATCCC TATTAAGAGC CTCTACATCA AGATTGGCAT 15180 

GATTTACAGC CTCTTGGGCT GCATACAAGG CATATAAAGA ATAGTTATCA AAACGGTTGG 15240 

TATCTTTTTT TACAAAGTAT TTATCGAACG GAAAATCTTG GATTTCTGCC GCATTATGCA 15300 

CATCAAAGTC ACTATGATCA AATTTTGTAA TGCCACCAAT GCCGATTTTC CCAGTTGCTA 15360 

AACTATTCCA AAATTCTTCT GGTGTATTTC CGATTGGAGA TGTTACTCCA TAACCTGTTA 15420 

CCACTACTCG ATTTAGTTTC ATTCTTTTCA CCTCTAGCTT TCGCTACATA CTTAAGCCAC 15480

CATCAATGGC AACCACTTGT CCAGTTAGAT AATCTTGGCC TGCTAAAAAT ACTGTCAAAT 15540 

CTGCAACCTG CTCTGCCTGC CCAAATTCTT TCATCGGAAT CTGAGCTAGT GTAGCTTCCT 15600 

TAATCTTATC TGACAGGATA GCGGTCATAT CAGACTCAAT CATTCCTGGA GCAATCACAT 15660 

TGACTCGTAT ATTCCGACTA GCGACCTCGC GTCCCACAGA CTTGGTAAAG CCAATCAAGC 15720 

CAGCCTTAGA AGCAGCATAA TTAGCTTGAC CAATATTCCC CATCAAACCA ACAACACTAG 15780 

ACATATTAAT GATAGCACCT TCTCTGGCTT TCATCATCGG TTTCAAGACT GATTGTGTCA 15840 

TATTAAAGCC ACCAGTCAGA TTGACCTTGA GCACTTTTTC AAAATCTGCT TCTGTCATCT 15900 

TGAGCATAAG AGTATCTTGG GTAATCCCTG CATTGTTGAC CAAAACATCT ACTGAACCCA 15960 

GTTCTGCAAT AGCTTGATCA ATCATACGCT TAGCGTCTGC AAAATCTGAT ACATCTCCTG 16020 

AAATGGGAAC CACCTTGATA CCATAGTTTG AAAACTCAGC GAGCAATTCT TCTGAGATTG 16080 

CCCCACGACT GTTTAAGACA ATGTTGGCTC CTGCTTGAGC AAACTTGTGG GCGATGGCAA 16140 

GACCAATTCC ACGACTCGAA CCTGTAATAA AGATATTTTT ATGTTCTAGT TTCATTTTTT 16200 

TCCTTTCAAA ACTTCTACTT ATTTTAGTCT ATTTTTCTAA AAGTGCTACT AAACTCGCTT 16260

GATCTTCCAC ATGAGCTAAG TGAGCAGTTT GATCAATTTT TTTAACAAAA CCTGACAAGA 16320 

CTTTCCCCGG TCCAATCTCG ATAAAGTTGC TTATGCCTGC TTCTTGCATG ACCCCAATAC 16380 

TTTCATAGAA ACGAACGGGT TCCTTGACCT GACGCGTCAA GAGCTGAGCA ATGTCCTCTT 16440 






TTTGCATCAC AGCAGCTTCT GTATTGCCGA CTAGGGGACA AGTAAAATCT GAAAAACTTA 16500

CCTGAGCTAG AGTTTCAGCT AGTTTCTGGC TAGCAGGTTC AAGGAGAGCG GTGTGAAAGG 16560

GACCTGACAC CTTAAGAGGA ATCAAGCGTT TGGCACCTGC TTCTTGCAAA AGTTCAACCG 16620

CTCGATCAAC TGCAACCACT TCTCCAGCAA TGACGATTTG TGCAGGTGTG TTATAGTTGC 16680

CTGGAGTAAC CACTCCAAGT TCAGAAGCTT TTTGACAGGC TTCTTCAATG ACCTCTACTG 16740

GCGTATTGAG AACTGCTACC ATCTTGCCAG AGTCAGCAGG AGCCGCTTCT TCCATATAGG 16800

CTCCACGCTT AGCTACCAAG GCAACCGCAT CTTCAAAATC CAAGGCGCCA CTTGCCACCA 16860

AGGCAGAGTA TTCTCCAAGA GACAAACCAG CAACCATATC AGGCTGATAG CCCTTTTCTT 16920

GCAATAAACG GTAGATAGCA ACCGAAGTCG CTAGAATGGC TGGTTGCGTA TAGCGGGTCT 16980

GATTGAGTTT GTCTTCTTCC GTATCGATGA GATAACGCAA ATCATAACCG AGCACCTGGC 17040

TCGCTCGATC AATCGTTTCT TTAACAATCG GATACTGATC ATAGAAATCC CGTCCCATCC 17100

CTAGATACTG GGCACCTTGA CCAGCAAATA AAAAGGCTGT TTTAGTCATT TCTTACAACT 17160

CCTGTCCAGC GAGAGGCTTC TTCTTGAATT TTCTTAGCGG CTCCGTAATA CAAATCTTTT 17220

AGGATTTCTT CAGCTGTTTC TTCTTTAGAA ACAAGCCCTG CGATTTCACC TGCCATAACA 17280

GAGCCACCAT CCACATCACC GTGAACAACT GCTTTGGCTA GAGCACCTGC TCCCATTTGT 17340

TCAAAGATTT CTAAATCAGG ATCTTCTTGC TTAAAGGCAT CTTTTTCAGC CAGTTCAAAA 17400

TCTCTACTCA ACTGATTTTT AATAGCACGA ACAGCATGAC CAAAGTGCTG AGCTGAAATC 17460

GTAGTATCAA TATCCCTTGC TTTTAAAATT TTCTCCTTGT AGTTTGGATG GGCATTCGAC 17520

TCTTTTGCAA CTACAAACCG TGTCCCCACC TGTACAGCCT CTGCACCTAG CATAAAGCCA 17580

GCCGCAGCAC CTTCACCATC CGCAATTCCT CCTGCAGCAA TAACAGGAAT AGATATAGCT 17640

GTGGCTACCT GTCGCACCAA GGTCATGGTT GTTAATTTAC CGATATGCCC CCCAGCTTCC 17700

ATTCCTTCTG CAATAACAGC GTCTGCACCG ATTTTTTCCA TGCGTTTAGC TAAAGCGACA 17760


CTAGGAACAA 


CAGGAATAAL 






AAfYTTTCCAT 


ATACTTGCTT 


17820 


CGATTTCCTG CTCCTGTTGT GACAACTTTA ACACCTTCTT CAATAACGAG ATCCACGATG 17880

TCTTCCACAA AGGGAGATAA GAGCATGATG TTGACCCCAA AGGGTTTATC AGTCAATGAT 17940

TTGATTTTAT CAATATTGGC CTTGACAACT TCTTTCGGGG CATTTCCCCC ACCGATAATT 18000

CCTAATCCTC CAGCCTTGGA AACAGCCCCT GCCAAATCAC CATCAGCAAC CCAGGCCATC 18060

CCTCCTTGGA AAATAGGATA ATCAATCTTC AATAATTCTG TAATACGCGT TTTCATAGTG 18120

CCTCCAACCT TCCTTGCTTA CGTAATAGTT CGATTTCACC ATAATTTGAC AGTCAAACTA 18180

TTACCTAAAC AAGAGGGAGT GGCTTTCTCC CTACTCCTTC TACTAATATT CTGCTTATTT 18240

TGCTTGCTCT TCAACGTAAG CAACCAAGTC ACCAACTGTT TTCAACTCAT TTTCTGCTTC 18300

GATTTGGATA TCAAAAGCAT CTTCGATTTC TGAGATTACT TGGAACAAGT CCAATGAATC 18360

TGCGTCCAAA TCATCAAAAG TTGATTCAAG TGTTACTTCT GATGCGTCTT TTCCAAGTTC 18420

TTCAACGATA ATTTCTTGTA CTTTTTCAAA TACTGCCATG ATAGGACTCC TTTAAAATAA 18480

ATAGTTTTTT TATAACAATG TGTTCACCAC ATGATTACCT AAATTGTAAG AATGAGCGTG 18540

CCCCAGGTCA AGCCTCCACC GAAGCCTGAT AGAAGAACAG TCTGGCTACC ATCTAAAGGG 18600

ATGAGACCTT GTTCTACACA CTCTGAAAGT AAAATCGGGA TACTGGCTGC ACTGGTATTG 18660

CCATATTCCA TCATATTGGC TGGAAGTTTG GCTCGGTCAA CACCAATTTT TCTAGCCATC 18720

TTATCCAAAA TACGGTCATT GGCTTGATGA AGTAGCAGAT AATCCAAGTC TGTCACCTCT 18780

ATAGGAGATT CATCAATAGT CTGCTTGATA GACTTGGCTA CATCTCGAAT GGCAAAATCA 18840

AAGACTGTGC GTCCATCCAT CTTCAAAAAC GAATCTGCAC TTTCTTGATC TGAAAATGGA 18900

GAATGTAAAC CTGAATGCCC ATAAGTTAAA CACTCGCTGC GACTTCCATC GCTATTGAGA 18960

CTCTCAGCTA AGAAATGCTC TTGCTCGCTA GCTTCTAACA AGACACCACC AGCACCATCT 19020

CCAAACAACA CAGCTGTTGA TCGATCCGAC CAATCGACTC CCTTAGAGAG GGTTTCACTA 19080

CCAATCACCA AGCCTTTTTG AAAGCGACCA GAAGCGATAA ACTTTTCAGC AGTTGAAAGA 19140

GCAAATACAA ATCCACTGCA AGCCGCGGTT AAGTCAAAAG CAAAGGCTTT ATTAGCACCA 19200

ATATTAGCTT GAACACGAGC AGCTGTAGAG GGCATCATCG AATCTGCACT AATGGTAGCT 19260

AGGATCATAA AATCCAGTTC TTCTCCTCTT ATTCCAGCTT TTGCCATCAG TTTCTTAGCA 19320

ACCTCTGTAG CCAAATCACT GGTAGATTCT GTTCTTGAAA TATGCCTTTG TCGTATTCCC 19380

GTTCGACTTG AAATCCACTC ATCATTGGTA TCCATAATCT GAGCCAAGTC GTGATTTGTA 19440

ACCACTTGCT CTGGCACATA ATGAGCAACC TGACTTATTT TTGCAAAAGC CATTATTTCA 19500

AATCCTCCAA AAATTGGTAA AGATTAGTCA AACCTTTACC CATGACAGCA ATTTCTTCCT 19560

CGCTCATGCC ATCAATAATT TTTTCTACCA TGGCCTTGTG GAAGCGTTTA TGCAGTCTAT 19620

GAATCAAGCG ACCCTTCTTT GTCAAATGCA GATGCACCAC ACGACGATCC TGTTCTGACC 19680

GAACTCGCTC AATGTAGCCC GG 19702


(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6211 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GAAAATTTCC TCTCTTCTCT TGAAAAATTT TGAAAAAATG GTATGATAGT AACAAGTTAT 60 

TTTTAAGAGG AAAGAAAGGG GAATAATGGA GAAAATCAGT TTAGAATCTC CTAAGACGGG 120 

GTCGGACCTA GTTTTGGAAA CACTTCGTGA TTTAGGAGTT GATACCATCT TTGGTTATCC 180 

TGGTGGTGCG GTTTTGCCTT TTTATGATGC GATATATAAT TTTAAAGGCA TTCGCCACAT 240 

TCTAGGGCGC CATGAGCAAG GTTGTTTGCA TGAAGCTGAA GGTTATGCCA AATCAACTGG 300 

AAAGTTGGGT GTTGCCGTCG TCACTAGTGG ACCAGGAGCA ACAAATGCCA TTACAGGGAT 360 

TGCGGATGCC ATGAGCGATA GCGTTCCCCT TTTGGTCTTT ACAGGTCAGG TGGCCCGAGC 420 

AGGGATTGGG AAGGATGCCT TTCAGGAGGC AGACATCGTG GGAATTACCA TGCCAATCAC 480 

TAAGTACAAT TACCAAGTTC GTCAGACAGC TGATATTCCG CGTATCATTA CGGAAGCTGT 540

CCATATCGCA ACTACAGGCC GTCCAGGGCC AGTTGTAATT GACCTACCAA AAGACATATC 600 

TGCTTTAGAA ACAGACTTCA TTTATTCACC AGAAGTGAAT TTACCAAGTT ATCAGCCGAC 660 

TCTTGAGCCG AATGATATGC AAATCAAGAA AATCTTGAAG CAATTGTCCA AGGCTAAAAA 720 

GCCAGTCTTG TTAGCTGGTG GTGGAATTAG TTATGCTGAG GCTGCTACGG AACTAAATGA 780 

ATTTGCAGAA CGCTATCAAA TTCCAGTGGT AACCAGTCTT TTGGGACAAG GAACGATTGC 840 

AACGAGTCAC CCACTCTTTC TTGGAATGGG AGGCATGCAC GGGTCATTCG CAGCAAATAT 900 

TGCTATGACG GAAGCGGACT TTATGATTAG TATTGGTTCT CGTTTCGATG ACCGTTTGAC 960 

GGGGAATCCT AAGACTTTCG CTAACAATGC TAAGGTTGCC CACATTGATA TTGACCCAGC 1020 

TGAGATTGGC AAGATTATCA GTGCAGACAT TCCTGTAGTT GGAGATGCTA AGAAGGCCTT 1080 

GCAAATGTTG CTAGCAGAAC CAACAGTTCA CAACAACACT GAAAAGTGGA TTGAGAAAGT 1140

CACTAAAGAC AAGAATCGTG TTCGTTCTTA TGATAAGAAA GAGCGTGTGG TTCAACCGCA 1200 

AGCAGTTATT GAACGAATTG GTGAATTGAC GAATGGAGAT GCCATTGTGG TAACAGACCT 1260

TGGTCAACAC CAAATGTGGA CAGCTCAGTA TTATCCCTAC CAAAATGAAC GTCAGTTAGT 1320 

GACTTCAGGT GGTTTGGGAA CAATGGGCTT TGGAATTCCA GCAGCAATCG GTGCTAAAAT 1380 

TGCTAACCCA GATAAGGAAG TAGTCTTGTT TGTTGGGGAT GGTGGTTTCC AAATGACCAA 1440 

CCAGGAGTTG GCTATTTTGA ATATTTACAA GGTGCCAATC AAGGTGGTTA TGCTGAACAA 1500 

TCATTCACTT GGAATGGTTC GCCAGTGGCA GGAATCCTTC TATGAAGGCA GAACATCAGA 1560 

GTCGGTCTTT GATACCCTTC CTGATTTCCA ATTGATGGCG CAGGCTTATG GTATTAAAAA 1620 

CTATAAGTTT GACAATCCTG AGACCTTGGC TCAAGACCTT GAAGTCATCA CTGAGGATGT 1680 

TCCTATGCTA ATTGAGGTAG ATATTTCTCG TAAGGAACAG GTGTTACCAA TGGTACCGGC 1740

TGGTAAGAGT AATCATGAGA TGTTGGGGGT GCAGTTCCAT GCGTAGAATG TTAACAGCAA 1800 

AACTACAAAA TCGTTCAGGA GTCCTCAATC GCTTTACAGG TGTCCTATCT CGTCGTCACG 1860

TTAATATTGA AAGCATCTCT GTTGGAGCAA CAGAAGATCC GAATGTATCG CGTATCACTA 1920 

TTATTATTGA TGTTGCTTCT CATGATGAAG TGGAGCAAAT CATCAAACAG CTCAATCGTC 1980 

AGATTGATGT GATTCGCATT CGAGATATTA CAGACAAGCC TCATTTGGAG CGCGAGGTGA 2040 

TTTTGGTTAA GATGTCAGCG CCAGCTGAGA AGAGAGCTGA GATTTTAGCG ATTATTCAAC 2100 

CTTTCCGTGC AACAGTAGTA GACGTAGCGC CAACCTCGAT TACCATTCAG ATGACGGGAA 2160 

ATGCAGAAAA GAGCGAAGCC CTATTGCGAG TCATTCGCCC ATACGGTATT CGCAATATTG 2220 

CTCGAACGGG TGCAACTGGA TTTACCCGCG ATTAAAAATC CAACTTAAAT TTATTAAACC 2280 

AGCCTAAAAG GCAATAAATA ATAGAAAAGA GAGAAAAGCT ATGACAGTTC AAATGGAATA 2340 

TGAAAAAGAT GTTAAAGTAG CAGCACTTGA CGGTAAAAAA ATCGCCGTTA TCGGTTATGG 2400

TTCACAAGGG CATGCGCATG CTCAAAACTT GCGTCATTCA GGTCGTGACG TTATTATCGG 2460 

TGTACGTCCA GGTAAATCTT TTGATAAAGC AAAAGAAGAT GGATTTGATA CTTACACAGT 2520 

AGCAGAAGCT ACTAAGTTGG CTGATGTTAT CATGATCTTG GCGCCAGACG AAATTCAACA 2580 

AGAATTGTAC GAAGCAGAAA TCGCTCCAAA CTTGGAAGCT GGAAACGCAG TTGGATTTGC 2640 

CCATGGTTTC AACATCCACT TTGAATTTAT CAAAGTTCCT GCGGATGTAG ATGTCTTCAT 2700

GTCTCCTCCT AAAGGACCAG GACACTTGGT ACGTCGTACT TACGAAGAAG GATTTGGTGT 2760 

TCCAGCTCTT TATGCAGTAT ACCAAGATGC AACAGGAAAT GCTAAAAACA TTGCTATGGA 2820

CTGGTGTAAA GGTGTTGGAG CGGCTCGTGT AGGTCTTCTT GAAACAACTT ACAAAGAAGA 2880 

AACTGAAGAA GATTTGTTTG GTGAACAAGC TGTACTTTGT GGTGGTTTGA CTCCCCTTAT 2940 

CGAAGCAGGT TTCGAAGTCT TGACAGAAGC AGGTTACGCT CCAGAATTGG CTTACTTTGA 3000 

AGTTCTTCAC GAAATGAAAT TGATCGTTGA CTTGATCTAC GAAGGTGGAT TCAAGAAAAT 3060 

GCCTCAATCT ATTTCAAACA CTGCTGAATA CGGTGACTAT GTATCAGGTC CACGTCTAAT 3120 

CACTGAACAA GTTAAAGAAA ATATGAAGGC TCTCTTGGCA GACATCCAAA ATGGTAAATT 3180 

TGCAAATGAC TTTGTAAATG ACTATAAAGC TGGACGTCCA AAATTGACTG CTTACCGTGA 3240 

ACAAGCAGCT AACCTTGAAA TTGAAAAAGT TGGTGCAGAA TTGCGTAAAG CAATGCCATT 3300

CGTTGGTAAA AACGACGATG ATGCATTCAA AATCTATAAC TAATTAGAAA TATATAGCGC 3360 

TGGAGATGAT TTTATGAAAA AGATTATGAG AAAAATTGCA TCGTTATTAT TGGTTCTAGT 3420 

TGTATAATGT AATTACACCG TCGGTAATAG TGCTAGCAGA CCAAAATAAA GCAGATTGGT 3480

CGTATGATGA AAATGCTGTA ATTAACATTT ATGATGATGC TAATTTTGAA GATGGTAGGT 3540

TGCATATGAA CTTTGAACAA TTCTTCAAAT TGGCACAAAT AGCTAGAGAA GAAGGTCTTG 3600


AAATTT 1 A TTC 

A A ^^X^ A 4 W 


TCCGTTGAG 


AGAGCTGGTG 


CGACTAAATC 


TGCTCGTTAT 


ATAGCGAAAT 


3660 


GGATTTTGAG AAATAAAAAA CATTAACAAA TATAGTTGGT AAATCATTAG GACCTAAATC 3720

AGCTGTTAGA TTCGGAGAAG CTTTATCCTA TATTGAAGGT CCTCTTCGCA GAATAAATGA 3780

GACGATAGAT GGCGGTTTAT ATCAAATAGA GCAAATTATT GCATCTGGAT TGAAAGAATC 3840

GGGTTTAAAT GACTGGACTG CGAAAACTTT AGCTTCAGCT ATTCGTGGGA TATTAGATGT 3900






CATATGAATA 


TTACCAATTT 


GTTTTCTATC 


AAGACAGGAT 


3960 


dfTCL atvia A AP 


TGATAGGPAA 


CTGCAAAAAC 


TATTTTTTCA 


GTTGGATTTA 


CAATTGGGAG 


4020 






AAATTAGATT 


CTAATTTTGT 


TCCTCGTACT 


CAATTTGTAG 


4080 


nLALu 1 1 vjUrt 


TTTnAATGAT 


GTAGAATATA 


AAGAAATTTT 


AAACTATTTT 


ATCTTCCATC 


4140 


Vj 1 An 1 OA 1 Avj 




TTGGTAGAAT 


GGTTATATGA 


TTGGATTTCC 

A- A * A, A • 


ACAAATCGTT 


4200 




<y> ?v » ifirTTT 
1 A/\/\u>A\» 111 


TCGATTCGTA 


TGGCTCATAA 


ATACCATGAA 


AGTGTTACTG 

m m m> mt m mv^m m ^mr 


4260 


A & *I — 1 *M"T* f~* 
f\J\Ki I ( 1 t L.lilj 


f\\jf\ 1 ut/\r\ 1 /vr\ 


CTAAAAAACA 


GTCATTAGTG 


ACTGTTTTTT 

4 A A * * * » 


ATAGAAAAAG 


4320 


nUu 111 1AIA 


l\j 1 1 SVt^j 1 


AAAAGATATA 


ATCAAGGCTC 


ACAAGGTCTT 


GAACGGTGTG 


4380 


(TTTr^TP 

It 1 i VJ 1 VJ/\M 1 f\ 


rrppArrnfiA 

l_ 1 Vw 1 Uun 


TTACGATCAT 


TATTTATCGG 


AGAAGTATGG 

■ W*m¥mr mm} M/^mT 4 * * * ^mr 


TGCTAAGATT 


4440 


1 Ai < 1 unnnn 


A A<TA A A ATPf* 


CCAGCGTGTT 


CGCTCCTTTA 


AAATTCGTGG 


TGCCTATTAT 


4500 






CGAAGAACGT 


GAACGTGGGG 


TAGTCTGCGC 


TTCTGCGGGA 


4560 




AflftfiAGTAGP 




AATGAAATGA 


AAATTCCTGC 


TACTATCTTT 


4620 


ATtVPf* ATT A 


rr APftfPACA 


ACAAAAGATT 


GGTCAGGTTC 


GCTTTTTTGG 


TGGGGATTTT 


4680 


fTT A APT ATT A 


AACTACTTGG 


AGATACCTTT 


GATGCCTCAG 


CCAAAGCAGC 


TCAAGAATTT 


4740 


ACAGTCTCTG AAAATCGTAC CTTTATTGAT CCTTTTGATG ATGCTCATGT TCAAGCAGGT 4800

CAAGGAACAG TTGCTTATGA GATTTTAGAA GAAGCTCGAA AAGAATCGAT TGATTTTGAT 4860

GCTGTCTTGG TTCCTGTTGG TGGTGGCGGT CTCATTGCCG GGGTTTCTAC CTATATCAAG 4920

GAAACAAGTC CAGAGATTGA GGTTATCGGA GTAGAGGCGA ATGGAGCGCG TTCCATGAAA 4980

GCTGCCTTTG AGGCTGGAGG TCCAGTAAAA CTCAAGGAAA TTGATAAATT TGCTGATGGG 5040

ATTGCTGTGC AAAAGGTAGG TCAGTTGACC TATGAAGCAA CTCGTCAACA TATTAAAACT 5100

TTGGTAGGTG TCGATGAGGG ATTGATTTCT GAAACCTTGA TTGACCTTTA CTCTAAGCAA 5160

GGGATAGTCG CAGAACCTGC TGGAGCGGCT AGTATCGCCT CTTTAGAGGT TTTAGCTGAA 5220

TATATTAAGG GGAAAACCAT TTGTTGTATC ATTTCTGGAG GAAATAATGA TATCAACCGT 5280

ATGCCAGAAA TGGAAGAGCG TGCCTTGATT TATGATGGTA TCAAACATTA CTTTGTGGTC 5340 

AATTTCCCAC AACGTCCAGG AGCTTTGCGT GAGTTTG7AA ATGATATCCT GGGGCCAAAT 5400 

GATGATATCA CACGTTTTGA GTATATCAAA CGAGCTAGCA AGGGAACAGG CCCAGTATTA 5460 

ATTGGGATCG CTTTAGCAGA TAAGCATGAT TATGCAGGTT TGATTCGTAG AATGGAAGCT 5520 

TTTGATCCAG CTTATATTAA CTTAAATGGT AATGAAACGC TTTATAATAT GCTTGTCTGA 5580 

GGACTAATAA AAAAATATCA TACCTTCATT TTGATTTCCT ATCTATTGAC AAGCATAGTC 5640 

ACACTGTCTT TAATACTCTT CGAAAATCTC TTCAAACCAC GTTAGCTCTA TCTGCAACCT 5700 

CAAAACAGTG TTTTGAGCAA CTTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTCATTG 5760 

AGTATAAGGT ATGATTTGAT TTCTTTTTGT TGACAAATAT ACTATATTAA AAAGATATAT 5820 

AAGTAATTAA CTGAGCTTAT CTGTCTTGTC ATCTCTATTA AGGATGGTTT AGATAATCGG 5880 

GTGTCTGCTT CTACGCTAGC ACCTCAATAT CCAAAGGACT GATGAATTTG AAGGACATAA 5940 

GGAATACCTA TCTCTCAGAT GATTTATTGA GGAAGAAAGA TAGGAGTTTT TGAGCTAGTG 6000 

AAGGCTTGGA TTTCTAAAGG TTAGAACTAT CATCTTCAGT TCTTAAATCG AAGAAATAAG 6060 

CTATCTTACG GAAATAGAGA AGCATTTTTT AAGAACTTGA ATAATTTCGC ACCTTAAGAG 6120 

CGTAATAATA CAGTATTTTT ATTAGCAAAT ATTTATGGTG TAGAGGCTAG CAAAACCTAT 6180 

ATATTATCGG ATTTAAAAAG GAAGTAAGAA A 6211 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7939 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CCGGACTCCC CACGATTCTT CAAAATAACT GAGTATATTT CTATCTTGAT TTTCAGATAT 60 

AAATTCTTCC TTCTGTGGCC TCTTCTTACG CTTGAGAAGA GCTTCTCCGA CATGGCTTCT 120 

TCCTTACTGA GCAAAACCTT GAGCATAGAT AAGTTTGACT GGCAAGCGTG CTCTTGTATA 180 

TTTGGCTCCC TTCCCACTAT TGTGGATAGC GAGGCGTCTT CTCATATCAG TCGTATAGCC 240 

TATATAGTAG GATCCATCAC GACACTCCAG AACGTACATA TAAGCCTTAT GATCCATAAT 300 

AAATCTCTTC GATTTCGGGC CTATAAGAGC CATCATCATT GTGGACAATC AAAGGAGGTA 360

AGACCTTAAA GCCACTTGTT GAGCCATCCT TGATCGCCTC AATCAAAAGC ATATTGGCTT 420

CCTTTTCTCT TTTTGGATAA ACAAACTGCA GGCGCTTAGG GGCTAGATTA TGTCGTTTTA 480

ACGTATCCAA AATATCCAGA AGTCGATCAG GACGATGAAC CATGGCCAAA CGCCCATTAG 540 

ACTTGAGAAT ACTCTGGGCA CTACGACAGA TTTCTTCCAA ATTAGTCGTG ATTTCGTGTC 600 

GAGCCAAGAG ATAATGTTCA CTCTCGTTCA GATTAGAATA AGGATTCACC TTGAAATAGG 660 

GTGGATTACA CAAAATCATA TCCACCTTAC TCCCCTGAAT GTGAGCAGGC ATATTTTTCA 720 

AATCATCGCA GATGACCTGC ATTTGCTCCT CTAATCCATT CAAACGGACA GAGCGTTCAG 780 

CCATATCCGC CAAACGCTCC TGAATCTCAA CAGACAATAT CTGTGCTTGA GTACGAGTGC 840 

TAGCAAAAAG CCCCACTGCT CCATTCCCAG CACAGAAATC CACAATCAAC CCCTTCTTAG 900

GAAAACGTGG AAATCGTGAT AAGAGAACAC TATCCACCGA ATAGCTAAAA ACCTCTCTAT 960 

TTTGAATGAT TTTGATATCT GTCGAAAAGA GCTGGTTAAT GCGCTCTCCT GATTTTAATA 1020 

ATTGTTCTTC TTCCATGGTC CTATTATAGC AAATTCATAT TAACATTACA AAAAATATAA 1080 

AACTCTAAAC TACTTCTTCT TTTTTAAATG GTGCAGGGCT TCTCCAGTCC AGATTGGTAG 1140 

CATTCGTCGA AAGGGAGCAA AGCCGTAGTT AAAGCGGTCG CTTGAAAAGC GTCTCCGTCT 1200 

AGGAAACTGG TACTTTTCTT CCTCCAAAGT GCGGATAGAA AGACTGGCTT TCCCTGTAAA 1260 

TTCATCTAAA TCCACTACCT GAACTTGAAC CTCTTCATCG ACTTTCAAGG TTTCATGAAT 1320 

ATTTTCAATA AATCCTGTCC GAATCTCTGA AATGTGAATC AGCCCCGTAT CACCCGTCTC 1380 

TAACTCAACA AAGGCACCGT AGGGCTGAAT CCCTGTAATA CGCCCCTTTA GCTTATCACC 1440

GATTTTCATC TTAGTCCTCG ATTTCAATAG TTTCAATTAC AACATCTTCA ACTGGCTTGT 1500 

CCATAGCTCC TGTCTCAACA GCAGGAATGG CATCCAAGAC AGCGTAAGAT GCTTCATCAG 1560 

CTAACTGACC AAAAACCGTG TGACGGCGGT CTAGGTGAGG TGTCCCACCT TGATTGGCAT 1620 

AGATTTCTGC AATCGGTTCT GGCCAACCAC CACGAGTAAT TTCTTTCTTA GAATAAGGTA 1680 

GGTGTTGGTT TTGCACGATA AAGAACTGGC TGCCGTTGGT ATTTGGACCA GCATTTGCCA 1740 

TGGAAAGAGC ACCACGGATA TTGTAAAGCT CTTCTGAGAA TTCATCCTCA AAAGATTCGC 1800 

CGTAGATTGA CTCGCCACCC ATACCAGTTC CAGTTGGGTC TCCACCTTGG ATCATAAAGT 1860 

CCTTGATAAT ACGGTGGAAA ATGACACCAT CATAGTAGCC ATCTTTTGAA AGAGATACAA 1920 

AGTTAGCCAC TGTTTTAGGA GCATGTTCAG GGAAAAGCTT GATACGTAAG TCTCCGTGAT 1980 

TGGTCTTAAT AGTCGCAAGA GGACCTTCTA CTGTTTCAAT GTCTACTTGT GGAAAATGCA 2040 

ATTCTTTTTC TACCATACCA AATACTTCTA AGGCAGCAAA AATGCCATCT TCTTCTAATG 2100 

TTTTTGTAAT ATAATCTGCT TTTTCTTTGA TTTTATCATG AGAAATTCCC ATGGCAACGC 2160 

TGATTCCAGC ATAATCAAAG AGTTCCAAGT CGTTGAGACC ATCTCCAAAA ACCATGACCT 2220

TCTCTGGTTT CAAGCCAAGG TGTTCCACAA CCTTTTCCAC CCCCGTCGCT TTGGAGCCTG 2280 

AAATCGGCAC AATATCAGAC GAATGTTGAT GCCAACGAAC CATGCGAAGT TTGTCTGAGA 2340 

GACTGTCAGG CAAGTGCAAG TCATCTCCCT TATCTTCAAA AGTCCACATC TGATAGATAT 2400 

CTTCTTTTTC ATGGAAATCG GGATCTACAT CTAAGTCGGG ATAAATTGGA TTGATAGCTT 2460 

CACTCATCAT ATCGGTGCGA GTCGACAACT TGGCATCATG ACTCCCAACC AAGCCATACT 2520 

CAATTCCTTC TTGCTTAGCC CAAGAGATAT ACTCCTCAAC ATCTGACTTT TCAATCTGAT 2580 

GCTGATAAAT GACCTGACCT TTTTTATCTT CGATATAAGC CCCATTCAAA GTTACAAAAA 2640 

AGTCAGGCTT GAGATCACGA ATCTCTGGAA CAACACCAAA AATGCCACGT CCAGAGGCGA 2700 

TTCCTGTTAA AATTCCTTTT TCACGCAACT GTTTAAAAAC AGTGGGAATT GTAGTTGGAA 2760 

TAAACCCTGT CTTTGAATTC CGCAATGTAT CATCAATATC AAAAAAGACA ATCTTGATCT 2820 

TCTTTGCCTT GTATCTTAAT TTCGCGTCCA TCTCACTACC TCTTTCAATC TAACTCTTTC 2880 

CATTATATCA TAAAGTAGGC AAATCCCCTA TTTTCAAAAA GTTTATCATT TTTATTTTAA 2940

TTTCTTGGAT GAGAAAAGAG ACATATTTAT GAAAAAGCTC CATCGTGCTT TTAATGTGTT 3000 

CTCTTGTTTT CAAACTCGTA AAAAGGGAGC CACTGATCCT AACTCGCTCT CTCATTTCAA 3060 

AGCTTGTGAA AAAAGACCCG TTGGGGTCTT AATTCGCTTT CTTGTTTTCA AGCTCATGAA 3120 

AAAGAGACCC AACTGGGTCT TTTCTTTAAT CTTCGTTTAC GAAAGGCATC AAAGCCATTA 3180 

CGCGAGCGCG TTTGATAGCT GTTGTTACTT TACGTTGCTT TTTAGCTGAA GTTCCTCTTA 3240 

CACGACGAGG AAGGATTTTC CCACGTTCTG AAACGAAACG GCTAAGAAGC TCAGTATCTT 3300 

TGTAATCAAC ATATTCAATT TTGTTTGCTG CGATGTAATC AACTTTTTTA CGGCGTTTGA 3360 

ATCCGCCACG ACGTTGTTGA GCCATGTTTT TTCTCCTTTA TAAGTTTAGT TGTCCATTAG 3420 

AATGGTAAAT CATCATCTGA AATATCCAAT GGGTTTGTTG CTCCAAATGG ATTTTCATTA 3480 

CGTGAAAAGT CTGGTACTGA ATTTGTAGGT GCTGAATAGT TTGCAGTTGG TGCAGAGTAA 3540 

GCTCCACCTG TGTGACCCTC ACGCACACTA CGGCTTTCCA ACATTTGGAA ATTCTCAGCC 3600 

ACGACCTCTG TCACGTAGAC ACGTTGTCCT TGCTGGTTAT CGTAACTACG AGTCTGGATA 3660 

CGACCTGTCA CCCCGATAAG TGAGCCTTTT TTAGCCCAGT TAGCAAGATT TTCAGCCTGT 3720 

TGGCGCCACA TAACGACATT GATAAAATCA GCCTCACGTT CACCATTTTG ACTCTTAAAT 3780

GTACGGTTTA CTGCAAGAGT AAAAGTCGCA ACTGCTACAT TTGATGGGGT ATAACGCAAC 3840 

TCAGCGTCAC GTGTCATACG CCCTACAAGT ACAACATTGT TAATCATAGT TTACCTTCTT 3900 

ACGCGTCAAT TTTGACGATC ATGTGACGAA GAATGTCAGC GTTGATTTTT GAAAGACGGT 3960

CAAACTCTTT AAGAGCTGCA TCGTCATTTG CTTCAACGTT AACGATGTGG TAAAGTCCTT 4020 

CACGGAAATC TTGGATTTCG TATGCAAGAC GACGTTTTTC CCAAGTTTTT GATTCAACAA 4080 

CAGTTGCACC GTTGTCAGTC AAAATAGAGT CAAAACGTGC TACCAAAGCG TTTTTAGCTT 4140 

CTTCTTCAAT GTTTGGACGA ATGATATAAA GAATTTCGTA TTTAGCCATT GATATGTTCC 4200 

TCCTTTTGGT CTAATGACCC CAAGACTTTG CAAGGGGTAA GTGAGGTTCG CTCACAATAA 4260 

ACTATTATAC TAGAAAAAAT TTTTTTACGC AAGTAAAAAC ACTAGAATTC GAAAAAACGC 4320

CACATGGGCG TTTTCCTGTT CTTATGGTTT GATACGGTGC AACATACGTG GGAATGGAAT 4380

AGCTTCACGG ATATGTTTTG TTCCTGCTGC GAAGGTTACC ATACGTTCGA TACCGATACC 4440 

AAATCCTCCG TGTGGAACTG TACCGTATTT ACGAAGGTCA AGGTAGAATT CATATTCTGT 4500 

ACGATCCATG CCAAGTTCAT CCATCTTAGC GACAAGGGCA TCGTAATCTT CCTCACGCAT 4560 

AGACCCACCG ATAATTTCTC CATAGCCTTC TGGAGCAAGC AAGTCTGCAC AAAGCACGCG 4620 

CTCTGGATTT CCAGGAACTG GTTTCATGTA GAAGGCCTTG ATGGCTGCTG GATAGTTCAT 4680 

GACAAATGTT GGCACACCAA AGTGGTTTGA AATCCAAGTT TCGTGTGGTG ACCCAAAGTC 4740 

ATCACCATGC TCAAGATGCT CGTAGTCAGC ATCTTCATCA TTTTCATGCT CTTGCAAGAG 4800 

GTCAATGGCT TGATCGTAAG TGATACGTTT GAATGGCTCT GCAATGTAGC GTTTCAAGAG 4860 

TTCTGTATCA CGTTCCAAGG TTTCCAAGGC TTGAGGCGCG CGGTCAAGAA CACCTTGTAG 4920 

AAGAGCTTTC ACATAAGCTT CTTGCAAGTC AAGCGACTCA TCATGTGTCA AGTATGAGTA 4980 

CTCAGCATCC ATCATCCAGA ACTCAGTCAA GTGACGGCGT GTTTTTGATT TTTCAGCACG 5040 

GAAAACTGGA CCAAAGTCAA AGACACGACC AAGAGCCATA GCCCCTGCTT CTAGGTAAAG 5100 

CTGACCTGAT TGGCTCAAGT AGGCTGGCGT TCCGAAGTAG TCAGTTTCAA AGAGTTCTGT 5160 

AGAATCTTCT GCCGCATTTC CTGAAAGAAT TGGGCTGTCA AACTTCATAA AACCGTTCTT 5220 

GTCAAAGAAC TCATAAGTTG CATAGATAAT AGCGTTACGG ATTTGCAACA CAGCTACTTG 5280 

CTTACGAGAG CGTAGCCACA AGTGACGGTT ATCCATCAAA AAGTCTGTTC CGTGTTCTTT 5340

TGGTGTGATT GGGTAGTCTT GAGATTCACC GATCACTTCG ATGTCTGTGA TGTCCAACTC 5400 

ATAGCCAAAT TTAGAACGTT CGTCCTCTTT GACAATACCT GTCACATAAA CAGACGTTTC 5460

TTGGCTCAAG CGTTTGATAA CATCAAACTT CTCAAGTCCC ACTTCTTCAC CAAATTTTTC 5520 

GACAAAGTTT GGTTTAAAAG CCACACCTTG AAAGAAGGCT GTTCCATCAC GCAATTGTAA 5580 

GAAAGCGATT TTTCCTTTTC CTGATTTGTT GGCAACCCAA GCGCCAATCG TCACTTCCTG 5640

ACCAACATAG TCTTTTACGT CAATAATCGT TACACGTTTT GTCATTATTT TTCCTTTTCT 5700 

TTTTTATTCT TTATGGCAAA CCACCTCTAT ATTGTTCCCA TCCAGGTCAA TCATAAAAGC 5760

AGCATAGTAA ATCGGATGCT CACTTCGATA ACCAGGAGCC CCATTGTCTC GCCCACCTCC 5820 

CTCTAAGCCA GCCTCATAAC AAGCCTGAAC TTCTTCCTTA TTTTCTGCTA AAAAAGCAAA 5880

ATGAACAGGA TCTTGTGTTC CCTGAGTCAG CCAAAAATCA CCACCAGGAT GAGGGCTGTT 5940

CGGGGATAGA AAACTAATTA GAGAACTAGT CTTAAAAGCC AATTTATAGT CCAAAGGAGC 6000 

GAGAAAACTC CTATAAAATC CTTATGAAAT TTGTAAATCC TTTACCTTAA TCTCAAAATG 6060 

ATCAATCATT CTCACTACCC ATAAATGCTT TCAAGCGTTC GACTGCTTCT TTAAGCGTGT 6120 

CTAGGTCTGT CGCATAGCTG AGGCGGACAT TTTCTGGTGC TCCAAATCCA GCTCCTGTTA 6180 

CCAAGGCCAC TTCGCCTTCT TCTAAGATAA CAGTTGTAAA GTCTGTCACA TCCGTGTAGC 6240 

CTTTCATCTC CATGGCCTTT TTGACATTTG GGAAGAGATA GAAGGCCCCT TGCCGTTTGA 6300 

CCACTTCAAA TCCTGGTACC TCTGCAAGGA GGGGATAGAT GGTATTAAGA CGTTCCTCAA 6360 

AGGCCTGACG CATGCTTTCT ACAGTATCTT GCTCACCTGA TAGAGCCTCA ACTGCTGCAT 6420

ATTGGGCTAC TGCTGACGGA TTCGAAGTTG TTTGACCTGC AATCTTGGAC ATGGCAGCGA 6480

TAATGTCTGC TTCTCCAACG GCATAACCAA TCCGCCAACC AGTCATGGCA TAAGTTTTAG 6540 

ACACACCATT GATGACCACT GTTTGCTTGC GAATCGCTTC CGATAGGCTA GAAATCGGTG 6600 

TGAACTCATG ACCATTATAA ACCAAGCGGC CATAGATATC GTCTGCTAGG ATGAGAATAT 6660 

CATTTTCTAC AGCCCAGTTT CCAATTGCCA AGAGTTCCTC ACGGGTGTAA ATCATACCTG 6720 

TGGGATTAGA TGGCGAATTC AGCACCAAAA CCTTGGTCTT GTCAGTGCGA GCTGCTTCTA 6780 

ACTGCTCTAC GGTCACCTTA AAGTGATTGT CTTCCTTAGC AGAAACAAAG ACGGGAACGC 6840 

CTTCTGCCAT CTTGACCTGA TCTCCATAGC TAACCCAGTA TGGGGTTGGG ATGATGACTT 6900 

CATCACCTGG ATTGACCACA GCCATAAACA AGGTATAGAG AGAATATTTG GCTCCCGCAG 6960 

CGACTGTCAC TTGATTTGAC GCTACAGAAT ACCCGTAAAA GCGCTCAAAG TAGCTATTGA 7020 

CCGCCGCCTT AAGCTCTGGC AGACCTGAGG TTACTGTATA AAAAGAAGCA CGCCCATCTC 7080 

GAATCGATGC AATGGCGGCA TCTTGGATAT TTTTGGGAGT AGTGAAATCT GGCTCACCCA 7140 

AGGTTAGAGA CAAAATATCT CTACCCTCAG CCTTCACTGC TTTGGCACGG GCTCCAGCAG 7200 

CCAAAGTCAC ACTTTCTTCC ATTTCTAAAA CACGGTTGGA TAGTTTCATA GGCCCTCCTT 7260 

GTTGACCAAT GCTCCTGTTT CAAAATCTAC TAGATAAAAA TCAGATCCTG ACTTAACTTC 7320 

CCAGATTGGC TTATCTTGAT AACGGCCAAA GGTTATCTTC TCAATCTCGC CAGCTCCCTT 7380 

TTCCTTAGAA ACCGTTTCTG CTTTTTCTTG TGAAACACCC TGATTTAGCT GATAAACGTA 7440

AATCTTATCG TCATCTTTAC CAATCAGGAC AGCAAGCGCT TCTTGCTCTT TGTTACGACC 7500

AAGAACGCTG TAATAAGATT CCAAGCCATT GTATAAATCA ACCTGATCAG CCTGCTCTAA 7560 

TCCTGCATAC TGCTGAGCTA ATTTTTCTCC TTCACTTTTA GCTGTTTGAT AGGGTTTCAT 7620 

GCTAAGAGAA ACCATATACA GAAAGGAACC ACTGATAACC ACAAACAAAA TCGTCATCCC 7680

TAGACCATAC TGCCACAGTA GATTATTTTT TGCTTTGTTT TGTCTTTTTT TCACTCGTCT 7740 

ATTTTACCAT CTATTAAGCT TTATTACAAG TGAATATAAG AATACTCTTC GAAAATCTCT 7800 

TCAAACCACG TCAGCTTTAT CTGCAGACCT CAAAGCTGTG CTTTGAGCAA CCAATTCTAT 7860 

TTCTCCCTTC AAACAAAACC GATTTTGAAA GTGAAACAGT TCTTACTTTT TCAGTCACAA 7920 

ATGATTAGAG TTTGCCGGG 7939 
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9897 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CCGCTCTACC GTCAAATAAT TACCATTTTG TTTAATACCG AAATTTTTAT CTACTGAAAA 60 

TTCAGTTGGT CTGTTGGTAC GATCGTCGTA TACAGTACCA TTCTCACGAA TAGTATAATT 120 

GTAATCAGTA TCACCTTGTT TCCTTAATTT AAGGTAATAA TTACCATCAA TTTGTTTATA 180

ACCTGAATCT TTTCTAGTTG CTTCTCTAAA ACTTACTCCA GCAGGCATCA CATCAGCAAA 240

CATGAGTACT TGTTTGTTCT TTTTTTCAAC AATAACAGAG TCAATATAGG TTGCACCACC 300 

GCTGATTTGT AAGTCACGTC CACCAACTTC ACGAGGCCAT TCTAATGGTA CTGGCGCAAA 360 

ATCATCGAAT GCCAATGTTA ATTTTGGTTT AGTCCATGTC TTACCATTAT CATCACTATA 420 

ACTTGTAGCA ATATTAATTT TATTCAAGAA ATCATGAGTT CCACCGTAAC GAGCGTCAAT 480

GCTTGAAAAT ACCCGACCAT TGCTAAAAGT ATACAGAACT GGAATACGGA AATAGTTAGA 540 

ACCTGTTGTA TCATTAGCCG TATAAATTAA ATGTCCAGTA ACAGCGTTTG TTGTCATCTT 600 

TTTAACAGTT TCTTCATCCA ATGCACTATT AAAGAATTTG ATATTTTCTA GTGTTCCGTT 660 

AAAACCAAAC GCCGTTTTTC CTGCACGTTT CACTCCCCCA AGCATATAGT AATCAATACC 720 

TTTAATATCC TTGATGTTTA GGAAATTATC CACTTTCTTT TCTACTACTT TTGTACCATT 780 

TGCGTATAAA GAATATGTTT TTTTGACTGA ATCTGCTACT ACTGCAACAG TGTTAGTCAC 840

AGCCTCTTGT TTGTACTTAC CCCAAACTGA AGCAGGTCTG GATACTAGGT TATTTTTATT 900

GGAAGAAGTA TCACGCGCTT CCATCCCCAA CTCACCATTG TCTCTAAGGA ACACATCTAC 960

ATAACTATTT TGTTGACCGG CTTTGGAATT AGATATTCCA AACAGACCTT GTAAGCCTTT 1020 

CTCACTTGAC TGATTGTACT TAATCACTAC AGTAAAGTCA CCGCTAGTAA ATTTATCCTT 1080 

TAACTCTTTA GTAACATTTT CTCCGCCCCC TGTTAAAGTA ACATTATTTT TTTCTAAGAC 1140 

AGGAGTTTCT TCCGCTGTAG AAGATGGATC CTTAACAGTA GTTTCAACTG TTCGAGGTTG 1200 

TACAGTAACT TCCGAAGAGT TATCCGATGT AGGTTGTACT TCCGAAATCG GAGTCGTTGG 1260 

TGCAACAGGT TGCACCAACT TTGGTGTTGA TACTTCAGAA GTTTCAGTCT CCTGAGCTGC 1320 

AACTGAGTTA GCAACAAATG CTGATAATAC CACTACAGTA CCTAAGGTTA CATATTGTTT 1380 

AATATTTTTT TTCATTTTAT TTTTCCTCGT TTAAAACTTT GATAACAAGT TTTTTAACAG 1440 

TTTCATCATT GCAATGAATC TTTGGTTGGT GAAGATCTTC TTCAAAAGTC ACCAACATAT 1500

TCCCTGGAAG CAATTCAACA ATTTGATAGT CTTTGCTATC GTAAAAAGCA ATATCCTTCT 1560 

CTTCGCTAAA AGGTACACGT GACTGGGCAC GAACTGGGGA AGTTACTGCC ATTTTTTCAG 1620 

TATTTTCAAC AACAATATGA ATATCTAAAT ATTTCTTATG AGTTTCAAAA ATATCTCCTG 1680

GAACTCCATC AGCTAGATAA GTCATACAAT TTGCAAAAAC ATTTTCCCCG TCAATATCAA 1740 

TTTTTCCATC AACTAAATCT GTCAAATTTG TATTTTCTAA AAAATCACAG ACTTTTGAAA 1800 

AATATTTATT GACAGAAGCA TATCGTTTAA AATCAGATTG TTCAGAAATA ATCATATTAT 1860 

TTTCTCTTTT CTATTAGTGA CGAACTTCCC AACTTGAATC CGCTTTAATT TCTGTAATAT 1920 

CATGAATCGT TGTATATTTA GGTGCAGATA CTTTATTTCC AGTAAGAACA GATACAATAT 1980 

AACCTGAAAC TACTGATACA CAGATTGAAA TCAATGAATA TGCCCAGTAG CTAACAGCTC 2040 

TTGGAGGAAG GAAGTATTTA ATAAATACCA TGACGATGGT TGATACAATC AGCGCTGCAT 2100 

AAGCACCTTG TTTATTTGCT TTTTTAGAAA CAAATCCAAG AATAAATACA CCACCAACTA 2160 

GACCAAGTAC AAGTCCCATG AAACTATTGA ACCATTCGTA TGCAGATTTA ATATCTGAGT 2220

GAGCCATGAC AATGGAAACA CCAATTGAGA ATAAACCTAC TGCTAGAGAT ACGAATTGTC 2280 

CAATTTTCGT ACGACGATTC TCTGACATAT TTTTAGAAAT GACATCTTGA ATATCCAATG 2340 

TCCATGAAGT TGCAACAGAG TTCAAACCTG TTGAAATAGT TGATTGAGAT GCTGCATAAA 2400 

TCGCTGCCAA GATCAAACCT GTGATACCTA CTGGTAACTG GTATGCAATA AAGTACATAA 2460 

AGATTTGGTC TTGAGGGATA TTGCTAGCTG CACTATCTGC ATTTTGTACT TGATAGAATA 2520 

CGTACAAGCC TGTACCAATC AAGTAAAAGA CTGTTGCAGT TGCAAGTGAC AAAACACCGT 2580 

TTGTGAACAA CATCTTATTA AGTTTCTTAA TATTTTGTGT TGTAGTAAAA CGTTGAACCA 2640

AATCTTGAGA TGAAGCATAG GAAGACAAGA TTGTAAAGCC TGAACCCATC ACAATTAAAA 2700

AGATGGAGTT TGAAAGCAAG TTAGGATCGA AAAGTTTTTC ATTTGCAGCA AGGAATTTCC 2760

CGTTTGCTAA TGTTTCTGCT ACTGCACCAA AGCCACCTTT AATATTAGCA ATCAGTACAA 2820

ATAAAGCTAA AACGACACCA CTAATCAGAA TCACACCTTG AATAAAGTCT GTCCATAATA 2880 

CGGATTTTAG ACCACCAGTA TAAGAATAAA CAATTGCAAC TACACCCATC AAAATAATCA 2940 

AAATATTGAT GTCAATTCCT GTCAATACTG ATAAACCAGC TGATGGGAGG TACATAATGA 3000 

TAGACATACG TCCCAATTGA TAAATAATAA ACAAGAGTGC TGAAATAATA CGAAGTGCTT 3060

TAGAATTAAA ACGTTTATCC AAGTAATCAT ATGCCGTATC GATGTCTATC CGTGCAAAGA 3120 

TAGGTAACAT AAAACGAATT GTCAGTGGAA TAGCTACTAC CATCCCTAAT TCAGCAAACC 3180 

ATAAAATCCA GCTACCTGCA TAAGAGCTAC CAGCGAGTCC CAAGAAGGAA ATCGGACTGA 3240 

GCATTGTGGC AAAAATGGAT ACCGAAGTAA CATACCAAGG AACCGAACCA TCTCCTTTAA 3300 

AGAACTCTTT TCCTTTCATC TCTTTTTTAG AGAAATAGAT ACCTGCAACC AACACCGCAA 3360 

GTAAATAAAC AATCAAGATA ATTAAGTCAA TTATTGTAAA TCCTGTTGTG CCCATAACAT 3420

ATCTCCATAT TGATTTTATT TATTATAAAA ATTCTTTTCG TGCTTGTTGA ATAAGTTCTG 3480 

CTGCTTGTTT TGCAACTTCC AAGTCACCTT CTGCCAATGC TTCTAAAGGT TGACGAACAG 3540 

AACCTAAATC AAGTTTTTCA TTTAGACGCA AAACTTCTTT TGCTACAGCA TACATATTTG 3600 

CCTTACCTGA TATCATCTTA TAGATAACTT CATTGATAGC ATATTGAAGT TTTTTAGCTG 3660 

TATCTAAATC TCGTTCTTGA ATCAAACTTT CCAATTTCAA GAACAAATCT GGCATAACGC 3720 

CATAAGTACC ACCAATACCA GCTTCTGCTC CCATCAAGCG ACCACCAACA TATTGTTCAT 3780 

CTGGACCATT GAATACAATG TAATCTTCTC CACCTGCAGC TACAAACATT TGAATATCTT 3840 

GTACAGGCAT AGAAGAATTT TTAACTCCAA TCACACGAGG ATTTTGACGC ATTGTTGCAT 3900 

ACAAACTACC AGTCAACGCA ACCCCTGCCA ATTGTGGAAT ATTATAGATA ATAAAATCTG 3960 

TATTTGACGC AGCTTCACTC ATTGCATTCC AATATGCTGC GATTGAATAC TCTGGCAATT 4020 

TGAAATAAAT AGGTGGGATA GCTGCAATAG CATCGACTCC AACACTTTCT GAATGTTTTG 4080 

CCAATTCGAT ACTATCTTTC GTGTTATTAC ATGCAATATG GTTGATAACT GTTAATTTAC 4140 

CTTTAGCAAC TTCCATAACA GCTTCAATAA TTTGTTTACG ATCTTCTACA CTTTGGTAAA 4200

TACATTCACC TGAAGAACCA TTTACATAGA TACCTTTTAC ACCTTTGTCA ATGAAATATT 4260

GTACCAGAGA TTTTACACGA TCTTGGCTAA TTTCACCATT TTCATCATAG CAAGCATAAA 4320

ATGCAGGGAT AACGCCTTTG TATTTAGTTA AATCTTTCAT CAGATTTCTC CTTTATATTG 4380 

TTTTTTATTT GATGACATTA ATAAATCGCT GAGCAATTTC TTTTGGACGT GTAATCGCTC 4440

CACCAATGAC TACACTGGTA ACACCTAAAC TATAACCTTT TTTTAATTGT TCTGGATAAT 4500

GAATTTTTCT TCGGCAATTA CCGGAATATT AAAATCAGCC AATTTTTTCA TTAGTTCAAA 4560

ATCAGGCTCA TCTGATTGTA CACTTGTACT TGTGTAACCT GATAATGTTG TACCAACAAA 4620 

ATCAACGCCT GATTTAAATG CATAGAGACC TTCATCTAAA TTACTTACAT CCGCCATCAG 4680

CAATTGATTC GGATATTTTT CTTTTATTTT TTTGATAAAT TCACTGACAA CTAAGCCATC 4740

ATATCTTGGT CTTAAAGTTG CATCAAATGC AATGACTGTT GTTCCGCATT CTACAAGTTC 4800

ATCTACTTCT TTCATCGTAG CAGTAATATA TGGTTCTTGA GGTGGATAAT CCCTTTTGAT 4860

AATTCCAATT ATTGGTAAAT CTACTACTTT CTGAATTGCT TTAATATCAC GCACAGAATT 4920

TGCGCGAATG CCCACTGCTC CTGCCTCTAA AGCTGCTTTA GCCATAAAAG GCATCAAGCT 4980

AAATTCTTCA TTATAAAGGG CTTCACCAGG TAAAGCTTGA CAAGAAACAA TGACTCCACC 5040

TTGAACTTGG CTTATAAATT TTTCTTTAGT CCAAATTTGG CTCATTTTAT TATTCCTCCT 5100 

TATGGATAAT AGTTTGATTG TAATAATATT GTCTCTCTGG ACTTTCCAGA TAATTAGAGA 5160 

ATAAGCAGTC TGTAATTAAA AGTATTGGAA ACTGAGGTGA TATGCGATTG CCATACGAGA 5220

GATGATCGGT CGAAGCTAAT AACAATAGTT CATCAAAGAA ACAATCTTCT TCGTCAAATT 5280 

TTCTTGTAGT CATTAAAACT GTTTTAGCGC CTTTATCTGC AGCTTTTTGT AGACCTTCTA 5340 

GTACAATATC AGTTTGACCT GAAATGGATG CTCCAATGAC AAGGCAATTT TCATTAAGTA 5400 

GTAACCTACT CCACAAAATC ATATCCTCGT CTGATAATAC TTCACCAATC ACTCCGAGAC 5460 

GCATAAATCT CATCTTCATT TCTTGTAAAG CAAGAACAGA ACTTCCTTTA CCGTAGAGAT 5520 

ATACACGCTC AGCAGTTTCT ATCATCTCAG CAATACCCTC AAGTTGAACT TCATCAAGAA 5580 

CCGTGTAAGT TTTTCTCAAC ATTTCCTCAT AGTCGGATAA AACTTTTTCT GTTGCCTCTG 5640 

TATATAATGC CAACTTTTCT TTCTCATGAA TCATCTCTTG GTATTTGAAA ATGAATTGTC 5700 

TAAAACCTTT AAAACCACAT TTTTTCGCAA ATCGAGTCAA TGTTGCTTTC GATACATTAA 5760

GGTATTCGCA CAATGCTTTA GATGAATAAT CATTCAGAGG TTGCTGTTTT AAGAAGAATT 5820 

TAGCAATGTC TTTTTCAGCA TATGCCATAT TTGGTAAGTT AGCTTCTATC ATTGGAATTA 5880 

GTTCTTTTTG CAGTAACATA TGAGCTCCTT AGTTGAAGTA AACGTTTACA TTCTTTATTT 5940 

TAACACTTTT TTTTTTTTTC AATATTTTTC ATAAATTAGA AACTAGTTTC CAATTTCTTT 6000 

CGTTTCATAA CAGAACAACA AACATAAAAA TATAATAGTT TTTATTCTTT TTATCGTAAT 6060 

TATATGTATT GTAAGAACGT TTATCACTAA TAATATGTTC ATATTAAAAT ATTTTAGTAA 6120

TATTTTATTT TGGTTTTATT ATTTCTTTTC GGAATTTCTA TATAATATTT TATTTCTAAA 6180

AAAATTGAAA AAATATTTCT AGTTTCTTTA TTTTATATAG GTAAT^TATT TTATTTCTAA 6240

ATTAAAAGAG AATCCCATAA AAACTACAGA TTTATGACAT AAATCAGGTC ACCTATTTTA 6300

AAAAAGCAGC AAACTATAAA CTAAAAAGTT CCACACCAAA TGTAACCCCA TACTTCCCCA 6360

TAAGTCAGAT TTATAGCGCA CCATACCTAA AAACATTCCA AGTGAAACGT ACAGACACCA 6420

AGCTAGAATG GTTCCTGGAT GATGTACTAA GGCAAATAAA ACACTTGTCA AAGCAACTCG 6480

AATATCTAAT TTTCTAACCA AGTTCCATAA AATTTCACGA TACAGAAATT CTTCAACCAT 6540

ACTCGCATTG ATTAAGAACA ATAAAAATCA AAACCAAGGA ACTTGATGTT GAAGGCCAAT 6600

TAAATTTGTT TGATTCGTGC TTCCTTGAGC ATGAATCAGG CTAAAACATA CACTTATAAT 6660

CAGTAGACTA GCTAGTCCAA TACCAAGGCA TTTCATCCTA GTTTTCATAT TGACCTTGAC 6720

CACTTGTTTT CGTTGACCAT ACATCCATAA AAAAGAAAAA AGAGACGCAC CATAGAGAAC 6780

CTGTAGTATA GTTAACTCAC CGATACAAAG AAATTTCAAT AAGTATAGAG ATACCAATAG 6840

GACATTTACT TCTTGGAATA TATAAACTGG AATTATTCTT TTCATAGTTA CCTCCGAAAT 6900

AAATCTTCAT AATCTAAATC TAATATCTGC ACAATCCTTT CTACCCATGG ACTTTGAGGC 6960

ATTCGTTGTT CCATCTTGTA GTGGCGAATC TTTTGATATA AACGATTCAA TTCACTTGGA 7020

TAGTGAAACT CTCCCGCAAA CATTTTTCTG GTTAACTCAA TCCAGCTGAT ATTTCTTTCA 7080

GCCAAAATAA TGGACAAGTT CTCCCAAAAT CGTTCAGCCA TATTrCTTCT CCTTTAGTTA 7140

GATAAATAAT GTGTTTGyGC CATGTAAATC AATTGTTTCG TATCTCTTGG CAATAGAGCT 7200

CTAGCCTCTT CCAAATTCAG ACTTGCATAA ACCCGCTTAT TTGAAACCAC AAAAGGAACT 7260

CCGATGGTTA GTTCAGGATT TTTTAAAATT ATCTCAACCA AATCCGTTAA TCTTAGATTG 7320

TCACGGTTCT TAAATCGTAA TAAATTGGGA GATAAAAACT CAAAACAATC TGAAGAATAG 7380

CTCATCATCT CAATTAATTT GTCCTTTGTC ATTTCAGAAA CTGAATGACA AGATACCTCA 7440

ATGCCATAGT TTTGGAAGAA GTCTAAAAGA AGTTGATTTC TTTGGCTATT TTTACTTACA 7500

TAGAGATCAA TCATGGGAGA CCTCCAACAA ATTTGCTTCC ATTTGATATT CTGAGACGAT 7560

TAAGGAATCT AACAACTTTG AGAAGTTAAT CGATTTCTTG TCTTCATCAT AAGCTTTTAC 7620

AGTTACTTGG GTTGTAAGTA TCCCCTCTTT TCCCTCGGCT CGATAGTCTT GTCAATATAA 7680

AACAAAAACA AGATTCTGAT TATCATCTAC AAAGGCATTA ACTCCGTTCT TTATATCCTG 7740

ACTTTCAAGG AATTCCATAA CGTTTTGAAG ATAGGATTCA TAAAATAGTG GGTAATTATG 7800

TTTTTTATGG TAATCATCTA AAAATCTTAC CTCAAACTCA CATGGATAAT TGGGCATCAA 7860

AAATATTTGT TCATCCAGCT GTTTGATTTC TGCATCATCT AATTCTGTTT CTAATTCATC 7920

ACAATCTAGT ATTGATTCTT TATTTAATGC TTTTATCTTT TTCCTCTATT TCTTTTAATT 7980

TCTTTGCGAT TGCGGCAATC ACAGGAACGC TTACACTATT ACCAACTTGT TTATAGAGCT 8040

GACTATTAAT AGAGACTTTT CTAGCAGCTT CAAAAGCCTA ATCAGGAAAG CCATGCAATC 8100 

GAAAACACTC TTTAGGAGTG ATTCGTCGTA TTCTCAAACG GTAAAATTGT CCATCTATTA 8160 

AAACACCAGC TACTTGGTAA ACTTGTTTAT CTTCTCCTTC ATAGCTAGCC ACTACTACTC 8220 

CCATTTGACC ACTAGTTGTT AACGTATTAG CTATACCTTT TCCAACTCTA CCACGACGAT 8280 

ACTGAGAACT TGGTCTTTCT AAATTGATTG AATCCCCAAT CTCTGCTTGA GCATATCCTT 8340 

TTTTCCTTGC TTCCCGTACT TTTAGAAATT GGATTGGTTC TGGAATTAGT ATTTTGGGGA 8400

TTTTATCTCC TCCTTGCATC GTAGTCAGTG TTGGAGATAA GCCCTCACTT CCATAGACAC 8460

GACCTGTCTC CTTAAAGCTA GTCGGTAAAT CTCCAACAAC GACAATGCCA TAACGATCCT 8520 

GAGTATTTAA AGTAAACATC GGCTCTTGAT TTTCCTTAAA GCGTCTCCCA TTTTGTCTCT 8580 

TGTCTAATCT ATCTGGTGTC ATACAAGGAA TCGCAACTTT AAATCCTTCT CCTTTACCAC 8640 

GAACTAAGGT TGGCGCAAGA CCTTCTGAAT AATAGACTTT ACCGCTCATT CCACTTCTTG 8700 

ATGGATTCAA ATTTCCTAGT GCTTTCAAAG TCTCAGAGTT AGTTGCTTGA CCTTCTCCTC 8760 

TGAAAGGAAA TAAGAGTCTG GTACCTTTCT TTCTAGAATG TCCGATAATA AACACCCTCT 8820 

CTCTGTTTTT GGGAACGCCA AAATCCTTAC TCTTAACCAC CTGCCACTCA ACATCAAACC 8880 

CCAACTCATC AAGTGTGGTA AGTATTGTGG TGAACGTCCG TCCCTTATCG TGATTGAGTA 8940 

GGCCTTTAAC ATTTTCAAGA AAAAGAAAAC GTGGTTGGAT TTGTTTCGCC GCCCGAGCAA 9000 

TTTCAAAGAA CAAAGTTCCT CTAGTATCTT CAAATCCCAA TCGTCTTCCT GCGATTGAAA 9060 

ATGCTTGACA AGGGAATCCC CCACAGATGA CATCGACTTT CCCTCTAAGT TTTTTAAATT 9120 

CGTCATCTGA AACATCTCGT ATGTCATGAA ATTCTATTTC TCCTTCCGTT TGAAAAATGG 9180 

ACTTATAAGA TTTCCTAGCA AATTTATCAA TCTCACAAAA TCCCAAGCAC TCATGCCCTT 9240 

GAGCTTCCAT TCCCATCCTA AAGCCTCCTA TCCCAGCAAA TAAATCTAAA ACCCAAATCA 9300 

TTCATACCTC TCTCAACTAG ATGTAACTTA CAAAACCCCT GACCTCATGA GCCACTTTCT 9360

TCCTCCTCAT GAGGTCAGTT TTACTTTCTG CTGTTCCAGT ATCGTTTTTC CTCGCTAGAT 9420

TTCCTCAAAA GGGCAGACTC CTCCCTTGGT TCGTCACACG ATTTTTTCAT CTCGACTGTT 9480 

CTTTAATGCA TCATTAACGA CGCTTTTCTT CTAGGTGGTT CATAAGGAAC AGGAAGATTC 9540 

AGCTTGACTT TTCTAATCCT AGAATAAAGT GCTGAAAACA ATTCGGAATA GGCATAGAGA 9600 

CTAGACAATT TGAGGAGCTG CTTGCGTCCT GTTCGAACAC ATTTTCCTAC CACGTGAAGA 9660 

AAAAGATGGC GGAAGCGTTT GATTGTTAAA GTTTGGAAGT CACCTCCAGC TAGATCTTTG 9720

ACAAAAAGAT AGAGATTGTA GGCGATACAG CTCATCATCA TACGAACTCG TTTTTGATTA 9780

AGGTTGAACT ATCCGTTTTA TCGCCAAAAA ATCCCTCCTT CATCTCCTTG ATGAAATTCT 9840 

CGGCTTGACC ACGTCCACGA TAAAGCTGAA ACTGGTCTTG GCTTGTTCCG GTACCGA 9897 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8148 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CCGTGGAACA AGCCAAGACC AGTTTCAGCT TTATCGTGGA CGTGGTCAAG CCGAGAATTT 60 

CATCAAGGAG ATGAAGGAGG GATTTTTTGG CGATAAAACG GATAGTTCAA CCTTAATCAA 120 

AAACGAAGTT CGTATGATGA TGAGCTGTAT CGCCTACAAT CTCTATCTTT TTCTCAAACA 180 

TCTAGCTGGA GGTGACTTCC AAACTTTAAC AATCAAACGC TTCCGCCATC TTTTTCTTCA 240 

CGTGGTAGGA AAATGTGTTC GAACAGGACG CAAGCAGCTC CTCAAATTGT CTAGTCTCTA 300 

TGCCTATTCC GAATTGTTTT CACCACTTTA TTCTAGGATT AGAAAAGTCA ACCTGAATCT 360

TCCTGTTCCT TATGAACCAC CTAGAAGAAA AGCGTCGTTA ATGATGCATT AAAGAACAGT 420

CGAGATGAAA AAATCGTGTG ACGAACCAAG GGAGGAGTCT GCCCTTTTGA GGAAATCTAG 480

CGAGGAAAAA CGATACTGGA ACAGCAGAAA GTAAAACTGA CCTCATGAGG AGGAAGAAAG 540 

TGGCTCATGA GGTCAGGGGT TTTGTAAGTT ACATCTAGTT GAGAGAGGTA TGAATGATTT 600 

GGGTAAATAC AATGAGCTTG AAAGAAGTAG CAAACTCACC AAGCGCCAAT TCTTTGAGAA 660 

TCAGATGCTG GATTATACCA TCATTGCGCA TGAGAGTTTT GAAATCATCC GTCATTCTGT 720 

CTACCAGACA GATGATCGTG AAGTGGAAAA TGCTCTGGCT TTTGAAGTGA AAAATGATGA 780

AACAGACAAG CTGATTCTGT TATTAAGCGA GGATATTGGT GTAGGTGAAA AATTGTGCCT 840 

CCTTGACGGA ACAAAAATGC GTGGAAAATG TTTAGTATAT GATAAAATAA ATGAGAGAAT 900 

GATTCGCTTG CAGTGCTAGA AATAGGCATT TTGAATAGTG AATATGTTAT AATAAGTATT 960 

AGTAGGAGGT GTTTTAGATT GGAGAAGAAA CTGACCATAA AAGACATTGC GGAAATGGCT 1020 

CAGACCTCGA AAACAACCGT GTCATTTTAC CTAAACGGGA AATATGAAAA AATGTCCCAA 1080 

GAGACACGTG AAAAGATTGA AAAAGTTATT CATGAAACAA ATTACAAACC GAGCATTGTT 1140 

GCGCGTAGCT TAAACTCCAA ACGAACAAAA TTAATCGGTG TTTTGATTGG TGATATTACC 1200 

AACAGTTTCT CAAACCAAAT TGTTAAGGGA ATTGAGGATA TCGCCAGCCA GAATGGCTAC 1260

CAGGTAATGA TAGGAAATAG TAATTACAGC CAAGAGAGTG AGGACCGGTA TATTGAAAGC 1320

ATGCTTCTCT TGGGAGTAGA CGGCTTTATT ATTCAGCCGA CCTCTAATTT CCGAAAATAT 1380

TCTCGTATCA TCGATGAGAA AAAGAAGAAA ATGGTCTTTT TTGATAGTCA GCTCTATGAA 1440 

CACCGGACTA GCTGGGTTAA AACCAATAAC TATGATGCCG TTTATGACAT GACCCAGTCC 1500 

TGTATCGAAA AAGGTTATGA ACATTTTCTC TTGATTACAG CGGATACGAG TCGTTTGAGT 1560 

ACTCGGATTG ACCGGGCAAG TGGTTTTGTG GATGCTTTAA CAGATGCTAA TATGCGTCAC 1620 

GCCAGTCTAA CCATTGAAGA TAAGCATACG AATTTGGAAC AAATTAAGGA ATTTTTACAA 1680

AAAGAAATCG ATCCCGATGA AAAAACTCTG GTATTTATCC CTAACTGTTG GGCCCTACCT 1740 

CTAGTCTTTA CCGTTATCAA AGAGTTGAAT TATAACTTGC CACAAGTTGG GTTGATTGGT 1800 

TTTGACAATA CGGAGTGGAC TTGCTTTTCT TCTCCAAGTG TTTCGACGCT GGTTCAGCCC 1860 

TCCTTTGAGG AAGGACAACA GGCTACAAAG ATTTTGATTG ACCAGATTGA AGGTCGCAAT 1920 

CAAGAAGAAA GGCAACAAGT CTTGGATTGT AGTGTGAATT GGAAAGAGTC GACTTTCTAA 1980 

AATGAAGGAA AATGACTTGC AATCTCTGTT AAGAAATAAA ATAATCCCAC CTAGAACAAG 2040 

CTAGGTGGGA TTATTTGCCT ATGAAATGAG AAATTATGGG AGCAAGCTCC TAAATCAACT 2100

GTTTTTGATC TACTTCTTTA ACTACTTGAT AAAAGTTATA GAAGTAGGCC AAACTTGAAA 2160 

TGATGGTTAC GACTAGGAAT ATTGAAAATT TCCATTGGAC AGGGTTGCTT AAAAGTTGTG 2220 

GAAAGGATAT GAGGAGAAAG AAGAGGGCTC CGTTGAGGAC AGGTATCCCT TTTGATTGTA 2280

TTTTCTCAAG TCCTTTATTG AGCGCAGCAA GAAAGAGGAG TAGGAGTAGT AAAACTGTAT 2340

GAGAAATAGC TCCTGAAGTA AGGGCGAAGA AAAGGAAAAT ACTGATAAAA ACATGAATGA 2400

TCAGTAGTCT AGCTAGTGAT TTCATAAGGC ACCTCCTAAT CCTGGTCTTT TTTAGCTCTT 2460

GCAATACGAA GTGAGTCGAC AATATGTATC ATCACTCCGA AAAAGAAAGC TCCCAGTATA 2520 

GTTTTAAAAA TATGTTTTGT ATTTAGAAGA GAACTGATAA AATTTCGATT TTCACTTGTT 2580 

AGGGTATCAA TGAGTGGAAT TATAAAAAAT ATCACTGTTC CATAAATCGA ACCTCCTTTC 2640 

AGACCAGGAT AACGTAACTG TTTCTTTTCT TTTTTCATGA GTTTCCTCCT AATCCTCATC 2700 

TTGATTTTTC TTAGTTTTTG CAATGCGACG GGAGATGAGG AACTGTATGC TCGCTCCGAA 2760

GAAAATAGAA CCGAGAATAC TTGATACACC ATTTCTTATA GTGAGAAGAG AATGAAAATA 2820 

GTCCTGACCT TCATCTATGA GTATCCTGAG AAGAGGAGTT ATAAAAAACA TCCATAGACC 2880 

AAAGAACAAA CCTGCTTTCA GACCTGGGTA GTGTAGTTGC TTGCTTTCTT TCTCATTCAG 2940 

CATATCTGGT TCAATGACTG TGATGCCTGT TTTTTTCATT TGGTAGGTGA CATAGCCAGA 3000

AGCGATGAGG GCAATCACTA AAATCAGAGG AGGATAGATT AGAGCCACTT CTTGAGGGTA 3060

TTTATAGGCC AGAAGCAGTG GAATAAGATT TCCGAAAATC ATCAGATAAA AGAGGATGAT 3120 

AAAGACTTGG TTCCCAATAC TATCGGCCTC ACGCCGTTTG TATTCGTCAA GGGGACCAGA 3180 

AATACCGTAT GTGCGTTTGA TCAGTTTTTC AGTGAAGGTT TCTTTTTTCA TGAGTTTGCT 3240 

CCTTTTTTAA AAATCTTCCT CCCAAAAGAG ACTGTTGAGG TCAGTTTGGA GGCTGCGGGC 3300

GAGATTGAGA CAGAGTTCCA AGGTTGGATT GTACTTGTCG TTTTCAATCA TATTGATAGT 3360 

CTCTCTCGAG ACACCGATAT CCTTGGCGAG TTCGAGCTGG GAAATACCCA ATTCCTTGCG 3420 

AAATTCTTTC ACACGATTCA TCTGTTCTCC TTTCTCATTT ATGTCGTATA TATTTGACTA 3480 

TATTATAGTC TTTTAAACAT AAAGTGTCAA GTATTTTTGA CATATTTTTT GAAGAAATAG 3540 

TAGTCTCCTT GTCCTATTTG TCTGACAAGT GCAAGCTGGT CGGATTTGTG GTAAAATAGA 3600 

TAAGATATGA CAAAAGAATT TCATCATGTA ACGGTCTTAC TCCACGAAAC GATTGATATG 3660 

CTTGACGTAA AGCCTGATGG TATCTACGTT GATCCGACTT TGGGCGGAGC AGGACATAGC 3720

GAGTATTTAT TAACTAAATT AAGTGAAAAA GGCCATCTCT ATGCCTTTGA CCAGGATCAG 3780 

AATGCCATTC ACAATGCGCA AAAACGCTTG GCACCTTACA TTGAGAAGGG AATGGTGACC 3840 

TTTATCAAGG ACAACTTCCG TCATTTACAG GCATGTTTGC GCGAAGCTGG TGTTCAGGAA 3900 

ATTGATGGAA TTTGTTATGA CTTGGGAGTG TCTAGTCCTC AATTAGACCA GCGTGACCGT 3960

GGTTTTTCTT ATAAAAAGGA TGCCCCACTG GACATGCGGA TGAATCAGGA TGCTAGCCTG 4020

ACAGCCTATG AAGTGGTGAA CAATTATGAC TATCATGACT TGGTTCCTAT TTTCTTCAAC 4080

TATGGAGAGG ACAAATTCTC TAAACAGATT GCGCGTAAGA TTGAGCAAGC GCGTCAAGTG 4140 

AAGCCGATTG AGACAACGAC TGAGTTAGCA GAGATTATCA AGTTGGTCAA ACCTGCCAAG 4200 

GAACTCAAGA AGAAGGGGCA TCCTGCTAAG CAGATTTTCC AGGCTATTCG AATTGAAGTC 4260

AATGATGAAC TGGGAGCGGC AGATGAGTCC ATCCAGCAGG CTATGGATAT GTTGGCTCTG 4320

GATGGTAGAA TTTCAGTGAT TACCTTTCAT TCCTTAGAAG ACCGCTTGAC CAAGCAATVC 4380 

TTCAAGGAAG CTTCAACAGT TGAAGTTCCA AAAGGCTTGC CTTTCATCCC AGATGATCTC 4440 

AAGCCCAAGA TGGAATTGCT GTCCCGTAAG CCAATCTTGC CAAGTGCGGA AGAGTTAGAA 4500 

GCCAATAACC GCTCGCACTC AGCCAAGTTG CGCGTCGTCA GAAAAATTCA CAAGTAAGAG 4560

GGAAAAAGAT GGCAGAAAAA ATGGAAAAAA CAGGTCAAAT ACTACAGATG CAACTTAAAC 4620 

GGTTTTCGCG TGTGGAAAAA GCTTTTTACT TTTCCATTGC TGTAACCACT CTTATTGTAG 4680 

CCATTAGTAT TATTTTTATG CAGACCAAGC TCTTGCAAGT GCAGAATGAT TTGACAAAAA 4740 

TCAATGCGCA GATAGAGGAA AAGAAGACCG AATTGGACGA TGCCAAGCAA GAGGTCAATG 4800

AACTATTACG TGCAGAACGT TTGAAAGAAA TTGCCAATTC ACACGATTTG CAATTAAACA 4860

ATGAAAATAT TAGAATAGCG GAGTAAGATA TGAAGTGGAC AAAAAGAGTA ATCCGTTATG 4920

CGACCAAAAA TCGGAAATCG CCGGCTGAAA ACAGACGCAG AGTTGGAAAA AGTCTGAGTT 4980

TATTATCTGT cmvrrriT GCCATTTTTT TAGTCAATTT TGCGGTCATT ATTGGGACAG 5040

GCACTCGCTT TGGAACAGAT TTAGCGAAGG AAGCTAAGAA GGTTCATCAA ACCACCCGTA 5100

CAGTTCCTGC CAAACGTGGG ACTATTTATG ACCGAAATGG AGTCCCGATT GCTGAGGATG 5160

CAACCTCCTA TAATGTCTAT GCGGTCATTG ATGAGAACTA TAAGTCAGCA ACGGGTAAGA 5220

TTCTTTACGT AGAAAAAACA CAATTTAACA AGGTTGCAGA GGTCTTTCAT AAGTATCTGG 5280

ACATGGAAGA ATCCTATGTA AGAGAGCAAC TCTCGCAACC TAATCTCAAG CAAGTTTCCT 5340

TTGGAGCAAA GGGAAATGGG ATTACCTATG CCAATATGAT GTCTATCAAA AAAGAATTGG 5400

AAGCTGCAGA GGTCAAGGCG ATTGATTTTA CAACCAGTCC CAATCGTAGT TACCCAAACG 5460

GACAATTTGC TTCTAGTTTT ATCGGTCTAG CTCAGCTCCA TGAAAATGAA GATGGAAGCA 5520

AGAGCTTGCT GGGAACCTCT GGAATGGAGA GTTCCTTGAA CAGTATTCTT GCAGGGACAG 5580

ACGGCATTAT TACCTATGAA AAGGATCGTC TGGGTAATAT TGTACCCGGA ACAGAACAAG 5640

TTTCCCAACG AACGATGGAC GGTAAGGATG TTTATACAAC CATTTCCAGC CCCCTCCAGT 5700

CCTTTATGGA AACCCAGATG GATGCTTTTC AAGAGAAGGT AAAAGGAAAG TACATCACAG 5760

CGACTTTGGT CACTGCTAAA ACAGGGGAAA TTCTGGCAAC AACGCAACGA CCGACCTTTG 5820

ATGCAGATAC AAAAGAACGC ATTACAGAGG ACTTTGTTTG GCGTGATATC CTTTACCAAA 5880

GTAACTATGA GCCAGCTTCC ACTATGAAAG TGATGATGTT GGCTGCTGCT ATTGATAATA 5940

ATACCTTTCC AGGAGGAGAA GTCTTTAATA GTAGTGAGTT AAAAATTGCA GATGCCACGA 6000

TTCGAGATTG GGACGTTAAT GAAGGATTGA CTGGTGGCAG AACGATGACT TTTTCTCAAG 6060

GTTTTGCACA CTCAAGTAAC GTTGGGATGA CCCTCCTTGA GCAAAAGATG GGAGATGCTA 6120

CCTGGCTTGA TTATCTTAAT CGTTTTAAAT TTGGAGTTCC GACCCGTTTC GGTTTGACGG 6180

ATGAGTATGC TGGTCAGCTT CCTGCGGATA ATATTGTCAA CATTGCGCAA AGCTCATTTG 6240

GACAAGGGAT TTCAGTGACC CAGACGCAAA TGATTCGTGC CTTTACAGCT ATTGCTAATG 6300

ACGGTGTCAT GCTGGAGCCT AAATTTATTA GTGCCATTTA TGATCCAAAT GATCAAACTG 6360

CTCGGAAATC TCAAAAAGAA ATTGTGGGAA ATCCTGTTTC TAAAGATGCA GCTAGTCTAA 6420

CTCGGACTAA CATGGTTTTG GTAGGGACGG ATCCGGTTTA TGGAACCATG TATAACCACA 6480

GCACAGGCAA GCCAACTGTA ACTGTTCCTG GGCAAAATGT AGCCCTCAAG TCTGGTACGG 6540

CTCAGATTCC TGACGAGAAA AATGGTGGTT ATCTAGTCGG GTTAACCGAC TATATTTTCT 6600

CGGCTGTATC GATGAGTCCG GCTGAAAATC CTGATTTTAT CTTGTATGTC ACGGTCCAAC 6660 

AACCTGAACA TTATTCAGGT ATTCAGTTGG GAGAATTTGC CAATCCTATC TTGGAGCGGG 6720 

CTTCAGCTAT GAAAGACTCT CTCAATCTTC AAACAACAGC TAAGGCTTTA GAGCAAGTAA 6780 

GTCAACAAAG TCCTTATCCT ATGCCTAGTG TCAAGGATAT TTCACCTGGT GATTTAGCAG 6840

AAGAATTGCG TCGCAATCTT GTACAACCCA TCGTTGTGGG AACAGGAACG AAGATTAAAA 6900 

ACAGTTCTGC TGAAGAAGGG AAGAATCTTG CCCCGAACCA GCAAGTCCTT ATCTTATCTG 6960 

ATAAAGCAGA GGAGGTTCCA GATATGTATG GTTGGACAAA GGAGACTGCT GAGACCCTTG 7020 

CTAAGTGGCT CAATATAGAA CTTGAATTTC AAGGTTCGGG CTCTACTGTG CAGAAGCAAG 7080 

ATGTTCGTGC TAACACAGCT ATCAAGGACA TTAAAAAAAT TACATTAACT TTAGGAGACT 7140 

AATATGTTTA TTTCCATCAG TGCTGGAATT GTGACATTTT TACTAACTTT AGTAGAAATT 7200 

CCGGCCTTTA TCCAATTTTA TAGAAAGGCG CAAATTACAG GCCAGCAGAT GCATGAGGAT 7260 

GTCAAACAGC ATCAGGCAAA AGCTGGGACT CCTACAATGG GAGGTTTGGT TTTCTTGATT 7320 

ACTTCTGTTT TGGTTGCTTT CTTTTTCGCC CTATTTAGTA CCCAATTCAG CAATAATGTG 7380 

GGAATGATTT TGTTCATCTT GGTCTTGTAT GGCTTGGTCG GATTTTTAGA TGACTTTCTC 7440 

AAGGTCTTTC GTAAAATCAA TGAGGGGCTT AATCCTAAGC AAAAATTAGC TCTTCAGCTT 7500 

CTAGCTGGAG TTATCTTCTA TCTTTTCTAT GAGCGCGGTG GCGATATCCT GTCTGTCTTT 7560 

GGTTATCCAG TTCATTTGGG ATTTTTCTAT ATTTTCTTCG CTCTTTTCTG GCTAGTCGGT 7620 

TTTTCAAACG CAGTAAACTT GACAGACGGT GTTGACGGTT TAGCTAGTAT TTCCGTTGTG 7680 

ATTAGTTTGT CTGCCTATGG AGTTATTGCC TATGTGCAAG GTCAGATGGA TATTCTTCTA 7740

GTGATTCTTG CCATGATTGG TGGTTTGCTC GGTTTCTTCA TCTTTAACCA TAAGCCTGCC 7800 

AAGGTCTTTA TGGGTGATGT GGGAAGTTTG GCCCTAGGTG GGATGCTGGC AGCTATCTCT 7860 

ATGGCTCTCC ACCAAGAATG GACTCTCTTG ATTATCGGAA TTGTGTATGT TTTTGAAACA 7920 

ACTTCTGTTA TGATGCAAGT CAGTTATTTC AAACTGACAG GTGGTAAACG TATTTTCCGT 7980 

ATCACGCCTG TACATCACCA TTTTGAGCTT GGGGGATTGT CTGGTAAAGG AAATCCTTGG 8040 

AGCGAGTGGA AGGTTGACTT CTTCTTTTGG GGAGTGGGAC TTCTAGCAAG TCTCCTGACC 8100 

CTAGCAATTT TATATTTGAT GTAAGAATGG CACCCTGATG TTTCAGGG 8148

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9909 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TACTCCACCC TTAATATCCG TTCCTGTAAA TACTTTACCG CTTTTAAGTT CATAGAATTG 60 

AACTTTTAAA TGCTTGTCTT CAAGCATCTT TTCCATCCAA TTTTTAGGAG TTTGACCAGC 120 

TTTAAATAAA AACCTTGCTG GGGTGATTAG TATAGATTTA TCTGCGATTT TATAAGCTTC 180 

ATCAATAAAA TAGTGATATA TCGGCTCATC TCTGGCTTCT CCTGTTTCCT GATACGGAGG 240 

ATTTCCTATC ACGACATCAA ATTTCATTTC ACTTTCCTCG CTAGATAGGC GCTCAAAACC 300 

TATCATTCTA TTCTTTTTCC AGTCTTTGAT ATGGGTTTTA GATTCTTCTA CTTCTTGGAC 360 

TTCTAGCTCA TCCGCAAACA AACTCAATTG TTGAGATTGC TTTTGTTTAG CTGAATAAGG 420 

ACTACTTTTT TTCAATCCAT CCATCTGAAA GACATTGTAA GAGATAATAG TCGCAATTTC 480 

TTTCTTTTGC TCTAATGTTG GTTGATTTCC AGTCTTAGCT AGATAATAGT CCTCAAAAGT 540

TGCCAAAAGA TTCTCACGCG CCAAAAGGAG AGAATCTCCT TGATACTCAT AACCATACGA 600 

AGCATGATAA GCATCTTTTA CAAGTTTATA AAATGTGACT TCATCTGAAA CCTCACGACT 660 

AATCCGTTGC AGTTTTCTAT CAACAAAACC AACTCGCTCA GATAATGGAA TTTCCTCACC 720 

AGTTACGGTA TCATATCTCG TTACCATATA AGGTGCTTCA CCACAAGTTA CCTCTAACCA 780 

TCGTAAGTCC ACATACTCCT CAAGACTTAA CGAGCCTAAT TTCGATTCTA CATATCCATT 840 

TTGCTTTGCG ACCAACCACG TTGGTGTAAA CACTTCTGCC CTTATTTTTG TCCGATCTTT 900 

TTGTTCATAT TTGGATTTTT CAGATCTGGG CTGAATCAAG TTGGCAAAGT TTCCAGTAAC 960 

CTTACTTGGA TTGATGCGAT CACTTGGAGC AAATCCCTTT CCTAACAATT CATAAGAATG 1020 

CGTAnGCCAA ACAATTGATT TCTTTGTCGT TCGATCTTTT AAAAGAATTT TTAATAAGTC 1080 

AGCCGATTCT TTAGCCAAAC TTTCTTCACT AATATCTATT GTCATCAGCA ACCTCTCTTA 1140 

TATTGTAAGC CCTATTATAT CATATTTTAA AGAATGAAAA TTTACTTGAA AAAAGTAATT 1200 

CAATAAATAT CTCTCCGATG ACCAACTTCT AGAGTAGCAA CGACTAATTC ATCATCTACA 1260 

ATTTGTACGA TAACTCGATA ATTACCAATT CTATACCGCC ATTGACCAAC GCGATTACCA 1320 

ACCAAAGCCT TTCCGTGTCG TCTTGGGTCT TCCAAAACAT TGGTTTGTAA ATAGTTTGTA 1380 

ATTAGCTTCT GCGTATAACG GTCCAATTTT TTCAATTGCT TGATAAAACG TCTTGTTGCA 1440

ACTAATTTAT ACAAATTATT CATCCTTCAA GCCTAAATCA TGCATCATTT CTTCCCAAGT 1500

AATGGGTTCA ACTCCTTTTT CCAAGTCTTC TAAATACTCT TCATAGGCTA AATCTGCCAC 1560

ACGAGCATCG TATTCATCTT CTAGGGCTTC AAGAGTTTTG GTGCGAATAA GTTCCGAAAG 1620

GGAAACTCCT TCAAACTTAG CCATTGCTTT CATAAATGTT TTATCAGCTT CAGAAACTTT 1680

TAATGTAATA GTAGTCATCT TTTGTGCTCC CTTTTTTAAT GGTAACACCA TTGTATTACT 1740


1111 nuU i VJ 1 


TPAGTCAATA 


TAAAAAGAAC 


ACCTTCTCAG 


CGTTCTTTCT 


ATATCTCTGT 


1800 


V./VV 1 VjVj 4 V 1 1 


GCGGTATCTC 


GTGAGGTATC 


ATAAACCTTA 


AAGTCTACTC 


CGACTCCCAG 


I860 


r\ 1 Vrnwv. 1 1 VJn 




TGACCATGGT 


CATATGAGCC 


AGTTCCTTGA 


TATTGTTTTC 


1920 


L 1 1 r\\Jt\ I IW\ 


TGtTPAAGGT 


AAATfTTPTT 


AGTACGATTT 


CCTAGCGTCC 


GAATCATAGC 


1980 


1 1 LAuL ALLL* 


1 LL i Lv» 1 1 nu 


/vviuU 1 unV. 


AAGfVTf" AG AT 


A GG ATTCGTT 


GTTTGAGTCG 


2040 


r^C* n A/"?/*V"**P 11 ft 
L U\AuCu 1 


Ij/v\LL 1 Lj/\ 1 L 




1 /\ n 1 1 vjvj 


TTGGCCTCGA 


TAAGATAACC 


2100 


ATLlGLAT^ I 


rLGAL AA 1 IjV. 




f^Tf* arTC ftp a 


TAAPPTGTAT 


CTGTCAAGAG 


2160 


GAlAAAAlTL 


11 A I LA 1 LL 1 


1 LA 1 AnAuLu 


/\ 1 Avj/l/VL J LiL 


VjVj 1 vL\in>> 1 vj 


TATCATGCCT 
1 \« n * wvjv> k 


2220 

w w 4* V 


TACAllAAAA 


LTLTLGA i\j I 


LuAiAlL 1 UL 




GTTTT A CCC A 


TTTV^AA A A At* 


2280 


AiGLTl 1 IGL 


GAAGAA 1 LLA 


fpTTrrf a a/*; 




1 L a 1 v_ n x /%vj 


CTTGCCAGGT 


2340 


Cl"l™r I'C ATTG 


/"•/-■»ff»AAA/""A*T* 

GCATAAAGAT 


LLA I ALl-A I A 


LI 1 lAunuLL 


AAAArnrrTA 


P^TffATGGAT 


2400 


ATGATCTGAA 


TGLTLArGGG 


I AA X LAAuA 1 


cine iTrrirr 


1 L 1 1 L 1 uuL 1 


TAfGGTTAAT 

4 /%V^VJVJ & 4 t 


2460 

4* ^ \J V 


TTCAGCTAGC 


AGALTGG i AA 




AUALAAuLL 1 


ATf*Tft r*T A 
\«L A1L1 /\L 1 


A A A G f*n*f*TT 


2520 

A J 4v W 


'ITITGAGGTT 


TL V. AG A * AAA 


a A ft T V T V TV^ 
AAuAAl 1 ILL 


AflYKAAPrr 
AL IVlwwLLL 




T A fTGT ATTT 

1 r\V« * \A 4 ** 4 4 4 


25B0 


AAAGLCTATT 


TLALJlATTL 


1 ALi i v_ I I L 1 A 


L 1 ILA I LL i L 


PPATAPTTPT 

^ V»/% 1 r\\ 1 1 \_ 1 


Tf*TTT C* ACTG 

l V^ 4 4> 4 VO^ 4 vJ 


2640 

4* V ~ W 


(.Al LL rl A 1 L 


A i AAGGGAL* 1 


AL AA l\>u 1 AA 




CTTGCCGTAT 


TCACTCTTGG 

M *m» «A^- A V* *• 4 


2700 


#"*r*r*H & AT>A A A 

L L LAAATAAA 




lul 1 1 o/\ 1 A/i 




GATAGACAGT 


CCTAGACCTG 


2760 


1 ALL ALL 11 G 


luLALunLl 1 


r^p & fir* & & T 


r*P AP AfflAT A 


GAAACGGTCA 


AAGATACGTG 


2820 


Vj 1 AAA 1 LL 1\j 


LI 1 ALAjAA 1 L 




GHTPAGAAAT 


GCATAAAATC 


ATCTGGTCTT 


2880 


CAGTTGTCTT CATTCTGACA GTGATTTTAC CCCCATCTGG CGAATACTTA ATAGCATTAT 2940

TTAAAATATT GTCGACAACC TGCGTCATCT TATCTGTATC AATTTCCATC CAGATAGAAT 3000

TGATGGGATA ATCTCTCACC AACTCATATT TTTTCTCCTT TTCCTGTCCT TTCATCTTGT 3060

CAAAACGATT GAGGATAAAG GTAATAAAAG CAGTGAAGTT AATCAGTTCC ACATCTAGGT 3120

GACTGGTAGC ATTATCAATA CGTGAAAGAT GGAGGAGATC CGTCACCATG CGCATCATAC 3180

GGTTGGTCTC ATCAAGAGAA ACCTTGATAA AGTCTGGTGC TACAGTTTCA CACAAAGCCC 3240

CCTCATCCAA GGCTTCAAGA TAGGATTTTA CGCTAGTCAG AGGAGTCCGT AACTCATGGC 3300

TAACATTGGA AACAAAGAGT CTTCGTTCGC GTTCTTCCTT CTCCTGCTCC GTCGTATCAT 3360

GCAAAACAGC CACCAAACCT GAAATAAAGC CAGACTCTCG ACCTA7CAAG GCAAAGCGAA 3420

CTCGAAGGTT CAAATATTCG CCATTGATAT CTTGGGAATC TAGCAACAAT TCTGGACTTT 3480

GGGTAATCAA ATCACGCAAT TCATAGTTTT CTTCTATCTT GAGCAATTCC AAAATGCTTC 3540

TATTCAGAAC ATCTTCCTTA ACCAACCCCA GTTGCTTCTT GGCTGTATCG TTAATCATGA 3600

TAATCTGACC CCGACGGTTA GTCGCAAGAA CCCCATCTGT CATATAAAAC AGAATACTAT 3660

TTAGCCTCTT ACTCTCTTGT TCTAGATTTT CCTGAGTGAG ACGAATAACC TCCGACAAGT 3720

CATTCAAATT ATTGGTAATA TTGGTGATTT CAGACCCACC TTGCATATCA AGAACCTTGG 3780

AATAATCTCC TGCAATCAAA TCTTTAACCT TTTGATTGAC TTGCTTCAAC TGAATATTAT 3840

CACGTCTATT TTCCAGTAAT AAGAGGCTCA CAACAAGGAT GAAACCTAAC AAAATCAGGA 3900

TAAAGATAAA ATCTCTGGTA AAAATGGTTT GTTTCAGTAA ATCAAGCATT ATTTCTCATG 3960

TAATACCCTA CACCACGCCG CGTCAAGATA TACTCTGGTC GGCTGGGCGT ATCTTCAATC 4020

TTCTCACGCA GACGTCGTAC AGTCACATCA ACTGTACGGA CATCACCAAA ATAGTCATAA 4080

CCCCAGACAG TCTCAAGCAA GTGTTCGCGC GTGATGACTT GACCTGTATG CGATGCTAAA 4140

TGATACAAAA GCTCAAATTC ACGATCGGTT AAGTCTAGTT CTTCGCCATA TTTTTTAGCC 4200

ACGTAGGCGT CTGGAACAAT TTCTAAATCC CCAATTTGGA TAGGTTGAGG TTTACTATCT 4260

GCTTCCTGAC CATCTACTGG CATAGGTTGA GAACGACGCA GAAGAGCTTT AACACGCGCC 4320

TGCAACTCAC GATTGGAGAA GGGTTTTCTT ACATAGTCAT CTGCCCCAAG TTCCAAACCG 4380

ATAACCTTAT CAAATTCACT ATCTTTGCCT GAAAGCATAA GAATGGGCAC ACTGCTTGTC 4440

TTACGAATGG TCTTAGCAAC TTCTAAACCA TCAATTTCTG GAAGCATCAA ATCCAGAATA 4500

ATAATATCTG GTTGCTCTGC TTCAAATTGC TCTAGCGCTT CACGACCATT AAAACCACTT 4560

ACAACTTCGT AACCTTCCTT GGTCATATTA AACTTGATAA TATCCGAGAT TGGTTTCTCA 4620

TCATCTACAA TTAGTATTTT TTTCATATGT TCACCTTTTT CTCTACTATT ATACCAAAAA 4680

AATAGTCAGA AGACACAATA GCTACTCTTG GCTACTCTCT AAGTTGGCTT GTGCATAAAC 4740

CTGCCAGATT TTTTGTTGGC GTTTGGCAAG TGGGTAATTC TTGAATTCTT CTGGTGAAAG 4800

CCAGCGAACT TCCCTATCTG AAAAATCATG GAAGTCACTC ACCTGACCTG CTACAATCTG 4860

TACATGCCAT TTTCGATGAC TAAAAACATG CTGGACTGTA TCAAAACAAA CATCAAGCCA 4920

ATCAACATCT AGGTCATAGT CCTGCTGGAA ACTCTCTTCT GGACTGGGAC CAAAGTTCAC 4980

ACTTTeTl'CC GCAACCTGAT GAAAGAGGTC AAACTGCTCT TCTTGCGAAA AGTTATCAAC 5040

TTCTATAAAG GGGAAATGCC AAAAACCTGC CAAGAGCTTT TCGCTTTCAT TTTTTTCAAG 5100

TAAAAATTGT CCTTGAGAAT TTTTCACAAC TAAGGCTTTA AGATAAATAG GAACCGGCTT 5160

TTTCTTAGGA GATTTAATTG GATAACGGTC CATGGTTCCA TTCTGATATG CCGCACTAAA 5220 

GTCCTTGACT GGGCTTTCTT CAGGTCTGGG ATTTACAGGA GACTCAATAT CAGACCCTAA 5280 

GTCCATCAAG GCTTGATTAA AATCACCCGG ACGATCCGGA TTAATCAAGA TCTCCATCAT 5340 

TGCCTGAAAA ATTTTTCGAT TACTTGGAAT CCCAATATCG TGGTTGACTT CAAACAGACG 5400 

CGCCAAGACC CGCATGACAT TACCATCTAC AGCTGGCTCA GGCAAGTTAA AAGCAATACT 5460 

GGAAATGGCT CCTGCTGTGT AAGGTCCAAT CCCTTTCAAG CTGGAAATTC CTTCATAGGT 5520 

ATTTGGAAAT TGGCCACCAA AGTCAGTCAT AATCTGCTGC GCTGCAGCCT GCATATTGCG 5580 

AACTCGAGAA TAATAGCCCA ACCCCTCCCA AGCTTTCAGT AAACTCTCCT CAGGCGCAGT 5640 

TGCCAGACTT TCGACAGTTG GAAACCAGTC CAAAAATCTT TCGTAGTAAG GGATAACTGT 5700 

ATCCACCCTG GTCTGCTGAA GCATGATTTC AGATACCCAG ATGTGATAAG GATTTTTACT 5760 

TCTCCTCCAA GGCAAATCTC TTTTGTTTTC ATCATACCAA GCCAGAAGTT TCTCACGGAA 5820 

AGAAATGACT TTCTCCTCCG GCCACATGAC GATACCGTAT TCTTTCAAAT CTAACATATC 5880 

TCTAGTATAA CACAGAAGGT TTCACCTGTC TTTGTATCTG ATTTATAATA TTTTCAATAG 5940

ATAGTATATA ACTTTTCTAT CTACTTATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAG 6000 

CTAGCCGCAG GTTGCTCAAA ACACTGTTTT GAGGTTGTGG ATAGAACTGA CAGAGTCAGT 6060 

ATCATATAcT ACGGCAAGGT GAAGCTGACG TAGTTTGAAG AGATTTTCGA AGAGTATAAA 6120 

TCTTATTGAT GAACTGCTTG CAGTCTGAGA AAAAATGAGC TTGGATATTA TTTCCAAACT 6180 

CACTTAAAGT CAATTTCAAT CCACTAGAAC AAGCCTAGTA CAGTTCCATC GCTTTCAACA 6240 

TCCATGTTGA GAGCTGCTGG ACGTTTTGGA AGACCTGGCA TGGTCATAAC ATCACCAGTT 6300

AAGGCAACGA TGAAGCCTGC ACCTAATTTT GGTACCAATT CACGAATGGT AATTTCAAAG 6360 

TTTTCTGGTG CTCCAAGCGC ATTTGGATTG TCTGAGAAAC TGTATTGAGT TTTAGCCATA 6420 

CAGATTGGCA ATTTGTCCCA ACCGTTTTGA ACGATTTGAG CAATTTGTGT TTGAGCTTTC 6480 

TTCTCAAAGT TCACTTTGCT ACCACGATAG ATTTCAGTGA CAATTTTTTC AATCTTTTCT 6540 

TGGACAGAAA GGTCATTATC ATACAAACGT TTATAGTTAC CTGGATTTTC AGCAATTGTC 6600 

TTAACAACTG TTTCGGCAAG TGCTACTCCA CCTTCTGCTC CATCAGCCCA GACACTAGCC 6660 

AATTCAACTG GTACATCGAT TGAGGCACAG AGTTCTTTTA AGGCTGCAAT TTCAGCTTCT 6720 

GTATCAGATA CAAATTCGTT AATAGCTACA ACTGCTGGAA TACCGAACTT ACCGATATTT 6780 

TCAACGTGGC GTTTCAAGTT AGCAAAACCT GCACGAACTG CCTCTACATT TTCTTCAGTC 6840 

AGAGCGTCTT TAGCCACACC ACCATTCATC TTAAGGGCAC GAAGGGTTGC GACAATAACA 6900

ACTGCATCTG GAGATGTTGG CAAGTTTGGT GTCTTGATAT CAAGGAATTT CTCAGCACCA 6960

AGGTCCGCAC CAAAACCAGC TTCAGTAACA GTGTAATCAG CCAAGTGAAG GGCTGTTGTC 7020

GTCGCCAAAA CAGAGTTACA GCCATGAGCG ATATTGGCAA ATGGACCACC GTGTACAAAG 7080

GCAGGTGTAC CGTAAATTGT CTGAACCAAG TTTGGCTTAA TAGCATCCTT CAAAATCAAA 7140

GCCAAGGCAC CCTCAACCTG CAAATCACCT ACAGAAACAG GCGTACGGTC ATAGCCATAA 7200

CCAATAACGA TATTCGCCAA ACGACGTTTC AACTCCTCGA TGTCCGTTGC CAAGCAAAGA 7260

ATTGCCATGA TTTCTGAAGC AACTGTAATA TCAAAACCAT CCTCACGTGG AATACCGTTT 7320

AGAGGACCAC CAAGACCAAC AGTCACATGG CGGAGCGTAC GGTCGTTCAA GTCCACAACG 7380

CGTTTCCAGA GGATACGACG TTGATCAATT CCCAGCTCAT TCCCTTCGTG CAAGTGGTTG 7440

TCAATCAAGG CAGAAAGGGC ATTGTTGGCA CTTGTAATAG CATGCATATC TCCAGTAAAG 7500

TGGAGGTTGA TGTCTTCCAT TGGCAGAACT TGTCCATACC CACCACCAGC AGCACCACCC 7560

TTGATCCCCA TGACTGGACC AAGAGACGGT TCGCGGATAG CAATCATGGT TTTCTTGCCA 7620

ATCTTGTTCA AGGCATCCGC AAGACCAATG GTAAGCGTCG ACTTTCCTTC ACCTGCAGGT 7680

GTTGGGTTGA TGGCAGTAAC CAAGATCAAT TTACCGACTG GATTGCTCTC AACTGCACGA 7740

ATTTTATCAA AGCTGAGTTT AGCCTTGTAC TTTCCGTACA ACTCCAAATC GTCATAAGAA 7800

ATACCAAGTT TCTCTACAAC ATCAACAATT GGCTTCAACT CAATACTCTG TGCGATTTCA 7860

ATATCTCTTT TCATTCAAAA TTCCTCTAAC CTCTTATATG ATAATTCATT ATATCACAAA 7920

ACAAGATTTT TAACATCCTA AAACTCTCTA AACGTTCGTA AATATCTCTC TTTTTAAGAC 7980

TTTTAGAGTC CTTTCTTAAA TTTTATATGC CTTTATAGTT TGAAACTATA ATAAATCTTC 8040

GTTTTTACCA AAAATTTATC ACTTTCATTT TACTTACCGC TTATTTTTGT GTACAATACT 8100

GCTATGAAAA TTTTAGTTAC ATCGGGCGGT ACCACTGAAG CTATCGATAG CGTCCGCTCT 8160

ATCACTAACC ATTCTACAGG TCACTTGGGG AAAATTATCA CAGAGACTTT GCTTTCTGCA 8220

GGGTATGAAG TTTGTTTAAT TACGACAAAA CGAGCTCTGA AGCCAGAGCC TCATCCTAAC 8280

CTAAGTATTC GAGAAATTAC CAATACCAAG GACCTTCTAA TAGAAATGCA AGAACGTGTT 8340

CAGGATTATC AGGTCTTGAT CCACTCAATG GCTGTTTCTG ACTACACTCC TGTTTATATG 8400

ACAGGGCTTG AGGAAGTTCA GGCTAGCTCC AATCTAAAAG AATTTTTAAG CAAGCAAAAT 8460

CATCAGGCCA AGATTTCTTC AACTGATGAG GTTCAGGTTT TGTTCCTTAA AAAGACACCC 8520

AAAATCATAT CCCTAGTCAA GGAATGGAAT CCTACTATTC ATCTGATTGG TTTCAAACTC 8580

CTGGTTGATG TTACCGAAGA TCATCTGGTT GACATTGCAC GAAAAAGTCT TATCAAGAAT 8640

CAAGCAGATT TAATCATCGC GAATGACCTG ACTCAAATTT CAGCACATCA GCACCGAGCT 8700

ATATTTGTTG AGAAAAATCA GCTTCAAACA GTCCAGACTA AAGAAGAAAT TGCAGAACTC 8760 

CTCCTTGAAA AAATTCAAGC CTATCATTCT TAGAAAGGAA AACTATGGCA AACATTCTCT 8820 

TGGCTGTAAC GGGTTCAATC GCCTCTTATA AGTCGGCAGA TTTAGTCAGT TCTCTAAAAA 8880 

AACAAGGCCA TCAAGTCACT GTCTTAATGA CTCAGGCTGC TACAGAGTTT ATCCAACCTT 8940 

TGACACTACA GGTACTCTCA CAGAATCCTG TCCACTTGGA TGTCATGAAG GAACCCTATC 9000 

CTGATCAGGT CAATCATATC GAACTTGGAA AAAAAGCAGA TTTATTTATC GTGGTACCTG 9060 

CAACTGCTAA CACTATTGCA AAACTAGCTC ACGGATTTGC GGACAACATG GTAACCAGTA 9120 

CAGCTCTAGC CCTACCAAGT CATATTCCCA AACTAATAGC TCCTGCTATG AATACAAAAA 9180 

TGTATGACCA TCCAGTAACT CAGAATAATC TGAAAACATT AGAAACTACG GCTATCAGCT 9240 

GATTGCTCCT AAGGAATCCC TACTAGCTTG TGGAGACCAC GGACGAGGAG CTTTAGCTGA 9300 

CCTCACAATT ATTTTAGAAA GAATAAAGGA AACTATCGAT GAAAAAACGC TCTAATATTG 9360

CACCCATTGC TATCTTTTTT GCTACCATGC TCGTGATACA CTTTCTGAGC TCACTTATCT 9420 

TTAACCTTTT TCCATTTCCA ATCAAACCGA CCATTGTTCA TATTCCTGTC ATTATTGCCA 9480 

GCATTATTTA TGGTCCACGA GTTGGGGTTA CACTTGGATT TTTGATGGGA TTACTTAGCT 9540 

TGACGGTTAA CACGATTACG ATTCTACCGA CAAGCTACCT CTTCTCTCCC TTCGTACCAA 9600 

ACGGAAACAT CTACTCAGCT ATCATTGCCA TCGTCCCACG TATTTTGATT GGTTTAACTC 9660 

CTTACTTAGT CTATAAACTG ATGAAAAACA AGACTGGTCT GATTTTAGCT GGAGCCCTTG 9720 

CTTCcTTGAC AAATACTATC TTTGTCCTTG GAGGAATCTT CTTCCTATTT GGAAATGTTT 9780 

ATAATGGAAA TATCCAACTT CTTCTGGCAA CCGTTATCTC AACAAATTCA ATTGCTGAAT 9840 

TGGTCATTTC TGCAATTCTA ACCCTAGCCA TTGTTCCACG ACTACAAACC TTGAAAAAAT 9900 

AAAAACAGG 9909 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TAATTTTCAT ATAATAGTAA AATAGAATGT CTGATTCAAT AATCACCTCA AATAGAAAGG 60 

AAATTCTATG TCAAATCTAT CTGTTAATGC AATTCGTTTT CTAGGTATTG ACGCCATTAA 120 






TAAAGCCAAC TCAGGTCATC CAGGTGTGGT TATGGGAGCG GCTCCGATGG CTTACAGCCT 180

CTTTACAAAA CAACTTCATA TCAATCCAGC TCAACCAAAC TGGATTAACC GCGACCGCTT 240

TATTCTTTCA GCAGGTCATG GTTCAATGCT CCTTTATGCT CTTCTTCACC TTTCTGGTTT 300

TGAAGATGTC AGCATGGATG AGATTAAGAG TTTCCGTCAA TGGGGTTCAA AAACACCAGG 360

TCACCCAGAA TTTGGTCATA CGGCAGGGAT TGATGCTACG ACAGGTCCTC TAGGGCAAGG 420

GATTTCAACT GCTACTGGTT TTGCCCAAGC AGAACGTTTC TTGGCAGCCA AATATAACCG 480

TGAAGGTTAC AATATCTTTG ACCACTATAC TTACGTTATC TGTGGAGACG GAGACTTGAT 540

GGAAGGTGTC TCAAGCGAGG CAGCTTCATA CGCAGGCTTG CAAAAACTTG ATAAGTTGGT 600

TGTTCTTTAT GATTCAAATG ATATCAACTT GGATGGTGAG ACAAAGGATT CCTTTACAGA 660

AAGTGTTCGT GACCGTTACA ATGCCTACGG TTGGCATACT GCCTTGGTTG AAAATGGAAC 720

AGACTTGGAA GCCATCCATG CTGCTATCGA AACAGCAAAA GCTTCAGGCA AGCCATCTTT 780

GATTGAAGTG AAGACGGTTA TTGGATACGG TTCTCCAAAC AAACAAGGAA CTAATGCTCT 840

ACACGGCGCC CCTCTTGGAG CAGATGAAAC TGCATCAACT CGTCAAGCCC TCGGTTGGGA 900

CTACGAACCA TTTGAAATTC CAGAACAAGT ATATGCTGAT TTCAAAGAAC ATGTTGCAGA 960

CCGTGGCGCA TCAGCTTATC AAGCTTGGAC TAAATTAGTT GCAGATTATA AAGAAGCTCA 1020

TCCAGAACTG GCTGCAGAAG TAGAAGCCAT CATCGACGGA CGTGATCCAG TCGAAGTGAC 1080

TCCAGCAGAC TTCCCAGCTT TAGAAAATGG TTTTtCTCAA GCAACT 1126



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2520 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CCGGCAACAA AAAAGAAAAA ATCAACAGTT AAAAAAAATC TAGTCATCGT GGAGTCGCCT 60 

GCTAAGCCAA GACGATTGAA AAATATCTAG GCAGAAACTA CAAGGTTTTA GCCAGTGTCG 120 

GGCATATCCG TGATTTGAAG AAATCCAGTA TGTCCGTCGA TATTGAAAAT AATTATGAAC 180 

CGCAATATAT TAATATCCGA GGAAAAGGCC CTCTTATCAA TGACTTGAAA AAAGAAGCTA 240 

AAAAAGCTAA TAAAGTTTTT CTCGCGAGTG ACCCGGACCG TGAAGGAGAA GCGATTTCTT 300 

GGCATTTGGC CCATATTCTC AACTTGGATG AAAATGATGC CAACCGTGTG GTCTTCAATG 360

AAATCACCAA GGATGCAGTC AAAAATGCTT TTAAAGAACC TCGTAAGATC GATATGGACT 420

TGGTCGATGC CCAACAAGCT CGTCGGATCT TGGATCGCTT GGTAGGGTAT TCGATTTCGC 480 

CTATTTTGTG GAAGAAGGTC AAGAAGGGCT TGTCAGCAGG TCGCGTTCAG TCCATTGCCC 540 

TTAAACTCAT CATTGACCGT GAAAATGAAA TCAATGCCTT CCAGCCAGAA GAATACTGGA 600 

CAGTTGATGC TGTCTTTAAA AAGGGAACCA AACAATTTCA TGCTTCCTTC TATGGAGTAC 660 

ATGGTAAAAA GATGAAACTG ACCAGCAATA ACGAAGTCAA GGAAGTCTTG TCTCGTCTGA 720 

CGAGTAAAGA CTTTTCAGTA GATCAGGTGG ATAAGAAAGA GCGCAAGCGC AATGCTCCTT 780 

TACCCTATAC CACTTCATCT ATGCAGATGG ATGCTGCCAA TAAAATCAAT TTCCGTACTC 840 

GAAAAACCAT GATGGTTGCC CAACAGCTCT ATGAAGGAAT TAATATCGGT TCTGGTGTTC 900 

AAGGTTTGAT TACCTATATG CGTACCGATT CGACTCGTAT CAGTCCTGTA CCGCAAAATG 960 

AGGCGGCAAG CTTCATTACG GATCGTTTTG GTAGCAAGTA TTCTAAGCAC GGTAGCAAGG 1020 

TCAAAAACGC ATCAGGTGCT CAGGATCCCC ATGAGGCTAT TCGTCCGTCA AGTGTCTTTA 1080 

ATACACCAGA AAGCATCGCT AAGTATCTGG ACAAGGATCA GCTTAAGCTA TATACCCTTA 1140 

TCTGGAATCG TTTTGTGGCT AGCCAGATGA CAGCGGCCGT TTTTGATACC ATGGCTGTTA 1200 

AATTGTCTCA AAAAGGGGTT CAATTTGCTG CCAATGGTAG TCAGGTTAAG TTTGATGGTT 1260 

ATCTTGCCAT TTATAATGAT TCTGACAAGA ATAAGATGTT ACCGGACATG GTTGTTGGAG 1320 

ATGTGGTCAA ACAGGTCAAT AGCAAACCAG AGCAACATTT CACCCAACCG CCTGCCCGTT 1380 

ATTCTGAAGC AACACTGATT AAAACCTTAG AGGAAAATCG GGTTGGACGT CCATCAACCT 1440 

ACGCGCCAAC CATTGAAACC ATTCAGAAAC GTTATTATGT TCGCCTGGCA GCCAAACCTT 1500 

TTGAACCGAC AGAGTTGGGA GAAATTGTCA ATAAGCTCAT CGTTGAATAT TTCCCAGATA 1560 

TCGTAAACGT GACCTTCACA GCTGAAATGG AAGGTAAACT GGATGATGTC GAAGTTGGAA 1620 

AAGAGCAGTG GCGACGGGTC ATTGATGCCT TTTACAAACC ATTCTCTAAA GAAGTTGCCA 1680 

AGGCTGAAGA AGAAATGGAA AAAATCCAGA TTAAGGATGA ACCAGCTGGA TTTGACTGTG 1740 

AAGTGTCTGG CAGTCCAATG GTCATTAAAC TTGGTCGTTT TGGTAAATTC TACGCTTGTA 1800 

GCAATTTCCC AGATTGCCGT CATACCCAAG CAATCGTGAA AGAGATTGGT GTTGAGTGTC 1860 

CAAGCTGTCA TCAGGGACAA ATTATTGAGC GAAAAACCAA GCGTAATCGC CTATTCTATG 1920 

GTTGCAATCG CTATCCAGAA TGTGAATTTA CCTCTTGGGA CAAGCCTGTT GGTCGTGACT 1980 

GTCCAAAATG TGGCAACTTC CTCATGGAGA AAAAAGTCCG TGGTGGTGGC AAGCAGGTTG 2040 

TTTGTAGCAA AGGCGACTAC GAGGAAGAAA AGATGGCTCT TTGTCAACTG TAGTGGGTTG 2100 

AAGTCAGCTA AGCTCGAGAA AGGACAAATT TTGTCCTTTC TTTTTTGATA TTCAGAGCGA 2160

TAAAAATCCG TTTTTTGAAG TTTTCAAAGT TCCGAAAACC AAAGGCATTG CGCTTGATAA 2220

GTTTGATGAG ATTATTGGTC GCTTCCAATT TGGCGTTAGA ATAGTGTAGT TGAAGGGCGT 2280

TGACGATTTT CTCTTTGTCC TTTAGAAAGG TTTTAAAGAC AGTCTGAAAA AGAGGATGAA 2340

CCTGCTTTAG ATTGTCCTCA ATGAGTCCGA AAAATTTCTC CGGTTCCTTA TTCTGAAAGT 2400

GAAACAGCAA GAGTTGATAG AGCTGATAGT GATGTTTCAA GTCTTGTGAA TAGCTCAAAA 2460

GCTTGTTTAA AATCTCTTTA TTGGTTAAAT GCATACGAAA AGTAGGGCGA TAAAAATGTT 2520



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10993 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS : double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TTTTCTCGAT AATAACTTCC ACCTTATTAT TTGGGATACC CTCCTCTTCT TCACCACCAC 60 

GTTCATAGTA GTCATCGCGA TAGAGAAAAG CTACGATATC AGCGTCCTGC TCAATAGACC 120

CAGATTCACG AATATCAGAC AAGACCGGTC TCTTGTCCTG ACGTTGTTCT ACACCACGAG 180 

AAAGCTGACT CAGAGCGATT ACTGGAACCT TCAATTCCTT GGCTAGTATT TTCAACTGAC 240 

GAGAAATTTC AGAAACTTCT TGTTGACGAT TTTCTCGACC AGTTCCCGTG ATAAGTTGCA 300 

AATAGTCTAT CAAAATCAAA CCAAGATTTC CAGTTTCTTG AGCCAATTTA CGAGAACGAG 360 

AACGAATCTC TGTAATCCGA ATACCTGGCG TATCATCGAT ATAGATACTG GCGTTAGcTA 420 

GATTACCCTG AGCAATAGTA TATTTTTGCC ACTCCTCATC TGTCAATTGC CCTGTACGGA 480

TAGAATGTGA CTCCACTAAG CCTTCTGCAG CTAACATACG ATCTACCAAG CTTTCCGCAC 540 

CCATTTCGAG TGAAAAAATA GCAACCGTTT TGTCCAACTT AGTCCCAATG TTCTCAGCGA 600 

TATTCAAGGC AAATGCTGTC TTACCAACTG CTGGACGAGC TCCTAAGATA ATCAACTCCT 660 

CCTCATGAAG TCCTGTTGTC ATATGATCCA AATCACGATA ACCTGTCGCA ATACCTGTAA 720 

TATCGGTCGT TTGTTGCGAG CGAGCTTCCA GATTTCCAAA GTTGAGATTC AACACATCTC 780 

GAATGTTCTT AAACCCGCTT CGATTTGCAT TTTCACTGAC ATCAATCAAC CCTTTTTCTG 840 

CCTGAGCAAT AATTTCATCA GCTGGTTGTG ACGCTTCGTA AGCTTGGTTG ACAGACTCTG 900 

TCAACTTGGC AATTAAACGA CGTAGCATTG CTTTTTCTGC AACAATCTTA GCATAATACT 960 

CCGCATTAGC AGAAGTTGGC ACAGAATTAA CAATCTCAAC CAAGTAAGAC AAGCCACCAA 1020

TATTCTGTAA ATCACCTTGA TTATCAAGGA TAGTACGAAC CGTTGTTGCA TCTATGGCAT 1080

CACCACGATC GGATAAATCG ACCATGGCTT GGAAAATCAA ACGATGGGCA TACTTAAAAA 1140 

AGTCCCGAGA CTCAATGTAT TCTCGCACAA AAACAAGTTT ACTCTCATCA ATAAAGATAG 1200 

CCCCTAAAAC CGATTGCTCA GCTAAGATAT CTTGAGGTTG TACTCGTAAC TCTTCTACTT 1260 

CTGCCATCAC ACTTCCCTTC CTTTTACAAT CTTGTCAAGA AGGTGTAAAC TTATCCTTCT 1320 

TTCACACCAA GATTGATTAC ACTTGTGATA TCTTGATAGA TTTTCACTGG CACATCAATC 1380

AAACCAACCG CTCGAATCGG AGCTTGTACT TGAATATGAC GTTTATCAAT CTTAATTCCA 1440 

AATTGCTTTT GCAATTCTTC TGCAATCTTC TTATTGGTAA TAGAACCAAA GGTACGACCA 1500 

TCTGGACCAA CTTTTTCAAC AAATTCTACA ACAGTTTCTT CTGCTTCAAG TTGTGCTTTA 1560 

ATTGCTTTTC CTTCTGCAAT CATCTCAGCG TGAGCTTTTT CTTCCGATTT TTGTTTACCA 1620 

CGAAGTTCAC CTACAGCTTG AGCAGTCGCT TCTTTGGCTA GATTCTTTTT GATAAGAAAG 1680 

TTTTGCGCAT ACCCTGTTGG TACTTCCTTA ATTTCGCCTT TTTTACCTTT TCCTTTAACA 1740 

TCTGCTAAAA AGATTACTTT CATTCTTCTT TCTCCTTTTC CTTCATTTCA TTTAATACAA 1800 

TTTCTGTCAG TTTTTCACCT GCTTCTGACA AGGTTACATC TTTAATTTGA GCTGCTGCCA 1860

AATTAAAGTG GCCTCCACCG CCTAACTCTT CCATAATCCG TTGTACATTC AGTTTACTAC 1920 

GACTTCGAGC TGAGATAGAG ATAAATCCTT GTGTATTCTT CGCAAGAACA AAACTCCCTT 1980 

CAATACCTGA CATGGCTAAC ATGGCATCTG CTCCCTTACT AATAACAACT GTATCATAGC 2040 

ATTTCATGTC CTTAGCCTCT GCTATTAGTA CATCTGAACC TAATTTACGC CCCTCTAAAA 2100 

TAAGTTCATT GACCTCACGA TATTCTTCAA AATCTGTCGC AGCGATTTCC TGGATAGCAA 2160 

TACTATCACT TCCGCGCGTT CTGAGATAGC TAGCAACATC AAATGTCCGA CTAGTTACTC 2220 

GCGAGGTGAA ATTTTTAGTA TCCAACATCA TACCAGCCAT CAAGACACTT GCTTGCATAC 2280 

GACTCAAACG ATTTTTCTTA GAATTCTGGA ACTGAATCAA TTCCGTTACC AACTCACTGC 2340 

CACTACTTGC ACCACTTTCG ATATAAGTAA TAACCGCATT ATCTGGAAAA TCCTGATCCC 2400 

TTCTATGCTG GTCAATAACA ATGGTTTGGG TAAATAAATC ATAAAATTCT TTTGATAATG 2460

TTAAGGCTGT CTTTGAATGG TCTACAAGAA TCAACAAAGA ACGATTGGTC ACCATCCCCA 2520 

TTGCATCCTT AACAGACAAC AACTTCGTAA CTCCTTCTTT TTCTATGAAT GAAACAGCTC 2580 

GTTCAATATC TGGAGACATT TGTTCTTCAT CATAAAGAGC ATAGCTATTT TCAATCACAT 2640 

TGCTGGCGAA CAACTGCATA CCTACAGCAG AGCCCAAAGC ATCCATGTCT AAATTTTTGT 2700 

GACCGACTAC AAAAACCTCA TCTACACTCC GAATCTTATC TGAAATAGCT GTCATCATAG 2760 

CGCGCGTACG ACTCCGTGTA CGCTTGATTG AAGCAGCAGA CCCACCACCA AAATAAACTG 2820

GATTTTTCGT TTCGTCGTTT TCCTTAACAA CCACCTGGTC GCCACCACGT ACTTCAGCCA 2880

AGTTCAAATT GAGCAAAGCA ACTTTCCCTA TCTCATCATG ATTTCCATCG CCATAAGAAA 2940 

ATCCCATACT TAAGGTCAAG GGCAACTGTC TCTGTTTCGA CTCTTCTCTG AAAGCATCAA 3000 

TAACAGAAAA TTTATCATTC ATCAAGCCCT CAAGCACCGT GTAGTCAGTA AATAGATAAA 3060 

ATCGATCCAT ACTTACCCGA CGAGAAAACA TCATCTGTTT TTCTGAAAAC TCTGATATAA 3120 

AATTAGCTAC AAAACTATTG ATTTGACTAA TATCTGACTC AGAAGTTTCA TCCTCCAAAT 3180 

CATCATAATT ATCCACAGAG ACAATCCCAA TCACTGGTCT ACTTGTTACC AATTCATCTG 3240

TTATGCCTTG TTCCCTGGAT ACATCTACAA AATACAAAAC ACCGGAAGAA GCATCCATAT 3300 

GAACAGCATA ACGCTTCTCA CCAAGCTTGG CATAAGTACA CGGATTTCCT ACTGAAGCCT 3360 

TGATAATCGT TTGAACAGCT TCTAAATCAA AATCACCATC TTCCTTGGTC AAAATCAATT 3420 

CAGCATAGGG ATTAAACCAC TCAACCTCTC CAGAAGATAA ATTCAATTTC ATAACACCTA 3480 

CAGGCATCTG TTCCAATAGA GCTGTCAAAC TTTCTTCCGC TTGGTGGTTT ACATACTGTA 3540 

TCTGTTCTAC ATCACTCCTT GTATAATGCA CTCTCAGTTT CTTAAATAAA AAAACATAGC 3600 

CTCCTACAAA AAGAAACAAA ATTAAAACCG TCAACAGATT ATTATTAACA AAAATAATGA 3660 

AAGTGGATAA GACTCCAAAC GCAATCAATC CTACTAGAAT AGGAAAAATT GGACTTACAT 3720 

AAAATTTTTT CATTCAAAAC CTCTTGGCAC CCATTATACC ATAATACCCC TCAAAAAGCC 3780 

ACTTTTTAAA AGTGTAATCA GTAATTCTAT CAATTATAAG AAAAAGGTAG TTTACAATTC 3840 

AGTAAACCTA CCTTTACACA TATTGAAATT AAGATTCTTT AACCTCTAAC AAACCAATTT 3900 

CGCCATCCTC ACGACGATAA ATCACATTGG TTGTCTGATC TTCAACATCC ACATAGATAA 3960 

AGAAATCATG CCCCAATAAA TCCATTTGTA GAATTGCTTC TTCCAAATCC ATTGGTTTTA 4020 

AATCAATTTG TTTTGAACGA ACAACTTTAG ACTGGACAAT ATTTGAATCT TCCACCAAAG 4080 

CATCTGTAAA TAATTGACCA GTTGCTACCT TATTTTTATT TTTACGCTCG ATTTTTGTTT 4140 

TATTTTTACG AATCTGACGT TCAATTTTAT CAGTTACAAG GTCAATTGAA CCATACATAT 4200 

CTTGAGATAC ATCTTCTGCG CGGAGAGTAA TAGATCCAAG CGGAATCGTT ACTTCCACTT 4260 

TAGCCGTTTT TTCACGATAA ACTTTTAAGT TAATTCGGGC ATCCAACTCT TGTTCTGGTT 4320 

GGAAGTACTT TTCGATCTTT TCGAGTTTAG AAACTACATA ATCACGAATT GCTTCTGTTA 4380 

CTTCTAGGTT TTCACCACGG ATACTATATT TAATCATATG AGTACCTTCT TTCTAAACAT 4440 

TTTTGTTTTT ATGATTTTAT TATAACGCTT TCATTCTATT TTTGCAAATT TTTTCCTCAT 4500 

CTTACAAGGG AAAATGTTTT TACATCCTTA GCACCAGCTT CTTCCAACAG TTTCTTAACA 4560

CGATTTATAG TTGCTCCTGT AGTATAGATA TCATCTATAA GTAGGATTTT TTTAGGAATA 4620

GTGACTCCAC TTTTAATAAA GAAAGGAAGT TCTGTCCCCA AGCCCTCTGA ACGATTTTTA 4680

GAAGAACTGG CTCTCTCTTC TCTTTTCTCT AATAAATCCA GATACTCAAA GCCTGCTGCC 4740 

TCTACCAAGC CCTCAACCTG ATTAAATCCT CTATTAGCAT ATCTATCAGG ACTTAGGGGA 4800 

ATTACAACAA ATTGATACTC TTTGTACTTT TTCAACTCCT CACTTAAAAA TGAAGCGAAA 4860 

ACTTTTCTTA ACAGGAAGTC TCCATCAAAC TTATACCGAC TGAAAAAATC CTTCATAGCT 4920 

TGATTGTAAG TAAAAATCGC TCTATGACTG ACTTCAACTC CCTCTTTACA CCAAAGTTGA 4980 

CAATCTTGAC ACTTTGTTGA CAACTCTGTT TTCATACAAT TTGGACAGTT CTCTTCCCCA 5040 

ATTCTTTCAA AAGTAGAATC ACAGTCTGAA CAAAGACAAG AGTCATCATT CCTCAGAAGT 5100 

AAGAGACTAC TAAAAGTTAA AACACTCTTC ATAGTCTGCC CACATAACAA GCACTTCATA 5160 

GACCAGCCTC CTTATTCATC ATCTGAATTT CCTTAATCGC CTTCTTGATT GAAGCATTTA 5220 

ACCCATCATG GAAGAAAAGC AAATCTCCTG TCGGTCTATC CATGCTTCGT CCAACTCGTC 5280 

CACCAATCTG AATCAAACTA GACTTGGTAA ACAAACGATG ATTGGCCTCT ACTACGAAAA 5340 

CATCCACACA AGGGAAGGTA ACTCCGCGCT CCAAGATTGT CGTACTGATA AGTATTGTCA 5400 

GTTCTCCATC TCGAAAAGCT TGTACTTGCT CTAATCGATC CTCTGTTACA GAAGATACAA 5460

AGCCAATTTT CTCATTTGGA AATTGCTCCT GTAAGATTTC TGCTAACTGC TCCCCTTTCT 5520 

TAATTTCTGA AGCAAAAATG AGTAACGGAT AAGCTGTCTT TCTCTGCTTC TCAATATAGG 5580 

ACTTTAACTT TGGTGACAAA CGATTCTTGT CTAAGTAGCG ATTAAAATCC GATAACCAAA 5640 

TTGGTTTTGG AATAATCAAC GGATTTCCAT GAAACCGTCT CCGTAAATTC AGTCTTTTTA 5700 

GTTCTCCTAA ACGGACCTTT TTATCTAACT CATTGGTCGA AGTCGCTGTT AAAAAGATTC 5760

TCAATCCATT CTCCTTTACA CTATTCTTGA CAGCGTGGTA AAGCATGGGA TTATCAACAT 5820

AAGGAAAAGC ATCTACTTCA TCCACTATCA GCAAATCAAA AGCTTGATAA AACTTCAATA 5880 

ACTGATGGGT TGTTGCAACA ACTAGTCGTG TTCGAAAATA AGGTTCCGAT TCTCCATGTA 5940 

GCAAAGCTAT CCCGCAAGAA AAATCCTGTT GCAGGCGCTT GTACAGCTCC AAACAAACAT 6000 

CTATGCGAGG ACTAGCCAAA CACACTGCAC CACCCGCATT GATCACTTTA GCCACTACTT 6060 

GATAAATCAT TTCTGTCTTT CCAGCTCCTG TTACCGCATG AACTAAGGTT GGCTTTTGCT 6120 

TGTCTACTAC TTGAAGCAAT CCCTCTGACA CCTTCTCTTG AAAAGGAGTT AATTGGCCGC 6180 

GCCATTTGAG AACATCTTGC TTTGGAAAAT CCTCCTGCCG AAAATAGTAT AAAGTTTGAT 6240 

CACTTCTGAC TCGCTTCATC AGCAAGCACT CTCGACAATA GTAAGCACCG ATGGGCAAAT 6300 

ACCATTCTTC TAGAATAGTA CTATTACAGC GTTGACAGAA AAGTTTCCCC TTCTCCTTTC 6360

TCATTGCTCG AAGTTTCTCC GCCAACTGAC GTTCTTCTTC TGTTAATTCA TTCTCAGTAA 6420

ATAAACGACC GAGATAATCT AAATTTACTT TCATACTTCT TTATTCGTAA AAACTAGCAC 6480

TTTAGATGAT TTTTTAGTAC AATTAAATCA TGGAATTTAG GACAATTAAA GAGGACGGTC 6540

AAGTCCAAGA AGAAATCAAA AAATCTCGCT TTATCTGCCA TGCCAAGCGT GTTTATAGCG 6600

AAGAAGAGGC TCGTGACTTC ATTACTGCCA TCAAAAAACA ACACTACAAA GCGACACATA 6660

ACTGCTCTGC CTTCATTATT GGAGAACGTA GTGAAATTAA ACGTACAAGT GATGATGGTG 6720

AGCCTAGTGG TACTGCTGGT GTTCCCATGC TTGGGGTACT AGAAAATCAC AATCTCACCA 6780

ATGTCTGTGT GGTCGTGACA CGCTACTTTG GTGGTATTAA ACTAGGCGCT GGAGGACTAA 6840

TTCGTGCTTA CGCCGGCAGT GTCGCCTTAG CTGTCAAAGA AATTGGTATT ATTGAAATAA 6900

AAGAAGAGGC TGGCATTGCT ATTCAAATGT CTTATGCTCA GTACCAAGAG TACAGTAACT 6960

TCCTTAAAGA ACATGGTCTC ATGGAGCTGG ATACAAACTT TACAGATCAA GTCGATACGA 7020

TGATTTATGT TGATAAAGAA GAAAAAGAAA CTATTAAAGC TGCACTTGTG GAGTTTTTTA 7080

ATGGAAAAGT CACTTTAACT GACCAAGGTT TACGAGAGGT TGAACTTCCT GTAAACTTAC 7140

TGTAAACAAT GAATAATACA GCGTTTCGTT GACATTCTCA CAACTACTTT AGCGAGCAAA 7200

ATAAAAAGAG GCGTACCAAA ATATACTAGA AAATGAAGCA ATTCAAACGA AACCTGATAT 7260

CGTTTTCCTT CACACCTATT TACTAGAATT AGCTGAACGC AATCACTTGA AAATTAATGA 7320

CTTTGATCTA TGATATATAG AAATGGTATG GATAGCGTTA TACTAAAGAT ATCTTATACA 7380

AAGAGGTATT CATATGTCTA TTTATAACAA CATTACTGAA TTAATCGGTC AAACACCGAT 7440

TGTTAAACTT AACAACATCG TGCCAGAAGG TGCTGCAGAC GTCTATATAA AGCTTGAAGC 7500

ATTTAATCCT GGTTCATCTG TAAAAGACCG TATTCCCCTT AGCATGATTG AAAAAGCTCA 7560

ACAAGATGGT ATTCTGAAAC CTGGTTCTAC TATTGTTGAA GCAACAAGTG GAAACACCGG 7620

TATTGGACTT TCATGGGTAG GTGCTGCTAA AGGGTATAAA GTCGTCATCG TTATCCCTCA 7680

AACTATGAGT GTAGAACGAC GTAAAATTAT CCAAGCTTAT GCTGCTGAAC TCGTCCTAAC 7740

TCCTGGTAGC GAGGGAATGA AAGGTGCTAT TGCTAAGGCT CAAGAAATCG CTGCTGAACG 7800

TGATGGTTTC CTTCCTCTTC AATTTGACAA TCCAGCTAAT CCAGAAGTAC ACGAAAGAAC 7860

AACAGGAGCT GAGATACTAG CTGCTTTCGG TAAAGATGGA TTAGATGCCT TTGTTGCTGG 7920

AGTAGGTACT GGTGGAACGA TTTCTGGTGT TTCTCATGCA CTCAAATCAG AAAATTCTAA 7980

CATTCAAGTT TTTGCAGTAG AAGCAGATGA ATCTGCTATT CTATCTGGTG AAAAACCTGG 8040

TCCTCACAAA ATTCAAGGTA TCTCAGCTGG ATTTATTCCT GATACACTTG ATACTAAAGC 8100

CTATGATGGT ATCGTTCGTG TAACATCAGA TGACGCTCTT GCACTCGGAC GTGAAATTGG 8160

TGGAAAAGAA GGCTTCCTTG TAGGGATTTC CTCAGCTGCA GCTATCTACG GAGCCATCGA 8220

GGTTGCCAAA AAATTAGGTA CAGGTAAAAA AGTCCTTGCC CTAGCACCAG ATAACGGTGA 8280

ACGTTATCTC TCTACAGCAC TTTATGAATT GTAACCGTCC AATAACGAAG TCTATTGAAA 8340

AATCTCCAGA CTAGAGAACT CACGGATAGT TCCTAATCTG GAGATTTCTT ATTTGCACTT 8400

TTCTTGTACA ACTTTAGTCC ATGGTAAATA GGCCTCTAAA ACCTCTTTGT TTACGAGAGT 8460

TTCCACGTTT GGAAGACATT CTAGAAGATA GGATAGATAT TTCTCACTAT TTATAATGGA 8520

TTGAAATAAG ATATGAACAA ATCGATTAGA ACATGATGGT AAAGCGTAAT CCCTTGTTTC 8580

TCAGCTTTCC CAGACAAAAA AGTCCAATAG TAAGTCAGCT GACTATCACT CTCTAGCACC 8640

CTATAAGAAG TTTCATCCGC ATGAAGTAAG GGCTGAGTCA ATAGTCTCTC TCGCAAGAGG 8700

TTATAAAGGG GCTCCAAATA GTATTGACTC GTCTTGATAT GCCAATTAGA GATTTCCTTA 8760

CGTGTGATTG GTAAACCCAT CCTAGCCCAA TCTTCTTCTT GGCGATAATT GGGTACCTTC 8820

AGATTAAACT TCTGATGGAT GGTGTGAGCC ATAATAGAAG CTGAGCCAAA GTTATGCGCT 8880

AAAGGGGCTT TAGGAATAGG AGCTTTCACA AGCTTATCCA GATGATTATC TTTTACTCGT 8940

TATGGACAAT GCTATATGGC ATAAATCAAG TACCTTAAAG ATTCCGACTA ATATTGGCTT 9000

TGCATTTATT CCTCCATACA CACCAGAGAT GAACCCCATT GAACAAGTGT GGAAAGAGAT 9060

TCGTAAACGT GGATTTAAGA ATAAAGCCTT TCGAACTTTG GAAGATGTCA TACAAGGACT 9120

GGAGAAGGAG GTGATAAAGT CCATCGTTAA TCGGAGACGG ACTAGAATGC TTTTTGAAAA 9180

CAGATGAGTA TAAAAAGAAA GTCCTCATTT CAATAGAAAT CACGACTTTC TGATGAATTT 9240

ATAGTAAAAT GAAATAAGAA CAGGATAGTC AAATCGATTT CTAACAATGT TTTAGAAGCA 9300

GAGGTGTACT ATTCTAGTTT AAATCCACTA TATTTGGGGA GTGATAGAAA AGCCCTTCAT 9360

CAGCCAATCT ACTTGTTCAG GTGCGAGAGC TTTGACATCC TTTTCTGTAC TGGACCAAGT 9420


(.Au'l'l'TTLCb 


11 t. I LAAAuL 


L» I I 1 AIaIaA 


TATPrA A1AT 


rmT &CC AT 


CCCAGTAAAG 


9480


AACTTTAAAG CGGTCTTTAC GTCCACCACA AAAGAGAAAG ACTTGATCCG AGAAAGGATC 9540

CAATTCAAAG TGGGTTTTAA CTACATAGGC TAATGAGTCT ATTCCCTGCC TCATATCTGT 9600

CTTGCCACAA ACAAGGTGAA CTTGACCTAA ATCACTTAGT TGAATTATCA TAGTACAATA 9660

CCTTTCCTCC GATAATTATT TTTTATCTGG TATACTGGAA GTTGGGGAAT TAGGATAGAT 9720

ACCTTGTTAT GACGCGCTTA CTATGAATTT GAAGTATAGT CTCCTAAATG CACTTAGCCC 9780

TTATTATAGG GCTTTTTGTT TTAATTATTC TAATCGAGTG AGACTGGGGA AAAAACAATT 9840

TCAGGAAAAA TCTAAGCCCT ATACAAAAAA GGAAGCAATT TGCTTCCTTT CTATTATTAG 9900

TTATTCAAGG CTGCTGCCAT TGTAGCTGCA ACTTCAGCTT CGAAGTCGTT TGCAGCTTTC 9960

TCGATACCTT CACCAACTTC AAAGCGAGCA AACTCAACTA CCGAAGCGTT AACTGATTCA 10020 

AGGTATGCTT CAACTGTCTT GCTGTCATCC ATGATGTAAA CTTGTGCAAG AAGTGTGTAA 10080 

GCTTGGTCAA CTTTAGTGTT ATCAAGCATG AAGCGATCCA TTTTACCTGG AATAATTTTG 10140 

TCCCAGATTT TTTCTGGTTT GCCTTCTGCA GCCAATTCAG CTTTGATGTC AGCTTCAGCT 10200 

TGAGCAATAA CATCATCAGT TAATTGAGCT TTTGATCCAT ACTTCAAGTG TGGAAGAGCT 10260 

GGTTTATTAA CCATTGCACG GCTTTCGTTG TCTTGGTCGA TAACGTGATT CAATTGTGCC 10320 

AACTCATCTT TAACGAATTG CTCATCCAAT TCTTTGTAAG AAAGAACTGT TGGTTTCATC 10380 

GCTGCGATGT GCATTGACAA TTGTTTAGCA AGTGCTTCGT CTCCACCTTC AACAACTGAA 10440 

ATAACACCGA TACGTCCACC GTTATGTTGG TATGCTCCAA AGTGTTGTGC GTCTGTTTTT 10500 

TCAATCAATG CAAAGCGACG GAATGAGATT TTCTCTCCGA TAGTTGCTGT TGCAGATACG 10560 

TATGCAGCTT CAAGAGTTTC ACCTGAAGGC ATTATCAAAG CAAGAGCTTC TTCGTTGTTA 10620 

GCAGGTTTTC CTTCAGCAAT GACTTTAGCT GTAGTATTTA CCAATTCAAC GAATTGAGCG 10680 

TTTTTTGCAA CGAAGTCAGT TTCACCGTTT ACTTCAATAA CTCCTGCAAC ATTACCGTTA 10740 

ACATAAACAC CAGTCAAACC TTCTGCAGCA ACACGGTCAG CTTTCTTAGC TGCCTTAGCC 10800 

ATACCTTTTT CACGAAGCAA TTCAATCGCT TTTTCGATGT CACCGTCTGT TTCTACAAGC 10860 

GCTTTTTTAG CGTCCATAAC ACCGGCACCA GATTTTTCAC GCAACTCTTT TACAAGTTTA 10920 

GCTGTAATTT CTGCCATTTT AATTCTCCTA TATTTTTTGA AAATAGCAGA GCGCGGCTAA 10980 

GCCCCGCCTC CGG 10993 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 8411 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS : double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CGACGGGGAG GTTTGGCACC TCGATGTCGG CTCGTCCCAT CCTGGGGCTG TAGTCGGTCC 60 

CAAGGGTTGG GCTGTTCGCC CATTAAAGCG GCACGCGAGC TGGGTTCAGA ACGTCGTGAG 120 

ACAGTTCGGT CCCTATCCGT CGCGGGCGTA GGAAATTTGA GAGGATCTCC TCCTAGTACG 180 

AGAGGACCAG AGTGGACTTA CCGCTGGTGT ACCAGTTGTC TTCCCAAAGG CATCGCTGGG 240 




TAGCTATGTA GGGAAGGGAT AAACGCTGAA AGCATCTAAG TGTGAAACCC ACCTCAAGAT 300 

GAGATTTCCC ATGATTATAT ATCAGTAAGA GCCCTGAGAG ATGATCAGGT AGATAGGTTA 360 

GAAGTGGAAG TGTGGCGACA CATGTAGCGG ACTAATACTA ATAGCTCGAG GACTTATCCA 420 

AAGTAACTGA GAATATGAAA GCGAACGGTT TTCTTAAATT GAATAGATAT TCAATTTTGA 480 

GTAGGTATTA CTCAGAGTTA AGTGACGATA GCCTAGGAGA TACACCTGTA CCCATGCCGA 540 

ACACAGAAGT TAAGCCCTAG AACGCCGGAA GTAGTTGGGG GTTGCCCCCT GTGAGATAGG 600 

GAAGTCGCTT AGCTTTAATC CGCCATAGCT CAGTTGGTAG TAGCGCATGA CTGTTAATCA 660 

TGATGTCGTA GGTTCGAGTC CTACTGGCGG AGTAATtGAT AAAAGGGaAC ACAGCTGTGT 720 

TCCTCTTTTT GTATCAATTT GTATCACCAA GCATTTTCAT AACGAAGTCT GTTATTTCTT 780 

GAGAACTTTC TTTTTTTCCA TGTGCAATCC AAGTTTGGCA GACACCAAAA AGTGCATGAG 840 

TTAGATAGAT GCTACTATAT TCTAATTCAG TGGTATTTAG ATTCAGTTGC ATAAATCGCT 900 

TTTGTAAATC TGTACTAAGC ATGATATGAA GTTTATTTCG TAAGAAATTT TGGATTTCTT 960 

TAGTCCCATT TTCAGAAAGA AGGGCAGCCA GAAGTGGTTC TGACTCTAGA TATTCAAAAA 1020 

CTTCTAAAAT AGCGTCTCTT TTGTGATGAG CATGTTTTTG AAAAATATAT TCAAATGTAT 1080 

GGAATAGCTT GCTTTGATAG TGCTCAATCA TATCATACTT ATCCTTATAG TGAGTATAGA 1140 

AGCTGGAACG ACTAATTCCG GCTTTTTCTA CTAATTTGAC AGTAGAAATT TTATCAAATG 1200 

GCTGTTCCAT CAGTAATTGT ACCATAGCAT TTTCAATAGT TCGCTTTGTT TTTAAGCGTT 1260 

TGTTACTTTC TTGCATATTT CCTCCTTCTA AACAAATTAG ACTATATGTC TAAAAATAGA 1320 

TTTTTTATCT TGTAATTTAG ATTTTTTAAT GTATAATCTA TTATATCAAA ATTTTAGACA 1380 

ATATGTTTAA AAAAGGACAA ACTAAGTTTA AAGAATTGAA AGCAATTTAA AAAAAACCAA 1440

CCTTTATTAT TGTCATGATC GGGATTTCTC TTATTCCAGA TCTGTACAAT ATCATATTTT 1500 

TGTCATCAAT GTGGGATCCA TATGGGCAAT TGTCTGACTT ACCTGTGGCA GTTGTAAATA 1560 

ATGATAAAGA GGCTTCCTAT AATGGTAATA CTATGGCAAT AGGAAAAGAC ATGGTGTCCA 1620 

ATTTAAAAGA AAATAAAACC TTGGATTTTC ATTTTGTAGA TGAAGAGGAA GGAAAGAAGG 1680 

GATTGGAAGA TGGCGATTAC TATATGGTAG TGACTTTACC AAGTGATTTA TCTGAAAAAA 1740 

CAACTACATT ATCCAATATT CAATCGACAG CAGCTTATCA ATCATTGACA AGTGAGCAAC 1800 

AAACTGAGAT AAGTGATTCT GTATCTCAAA ATTCAACTGA TAGTATTCAA TCGGCTCAGT 1860 

CAATTGTAGC TTTAGTACAA GATTTACAGG GAAGTTTAGA AAACTTACAA AATCAATCTT 1920 

CTAATCTTTC GACTTTAAAA AATCAATCTA ATCAAGTATC ACCTATTACT TCTACTTCTT 1980 

TGATAGGATT GTCAAGTGGA TTAACAGAGA TACAAGGAGA TGTTACTAGC AAATTAGTTC 2040

CTGCCAGTCA GTCGATTGCA TCAGGTGTAA ACGCATATAC TACAGGTGTT GATAAAGTTT 2100

CTCAGGGCGC AAGTCAACTA AGTGAAAAAA ATGCCACCTT CACAGCTAGT TTGGATAAAC 2160 

TAGTTTCAGG CTCAAACACC TTGACACAAA AATCTTCTAG ATTGACAGCA GGAGTTGGTT 2220 

AATTACAATC AGGATCTGGG CAATTAGCAG ACAAATCCAG TCAGTTACTT TCAGGTGCTT 2280 

CTCCATTAGA GAATAGAGCT AATAAATTGG CAGATGGATC TGGGAAACTA CCAGAACGTG 2340

GAACAAAGTT AACTTCTGGA TTGGAAGATT TACAGACAGG ACTTGCTTCT TTAGGACAAG 2400

CACTAGGTAA TGCTAGTGAT CAACTCAAAT CAGTATCAAC AGAATCTAAA AATGCAGAGA 2460 

TTTTGTCAAA TCCACTCAAT CTTTCAAAAA CAGACAATGA TCAAGTTCCT GTAAATGGAA 2520 

TCGCAATAGC TCCTTATATG ATATCAGTTG CTCTTTTTTT GCAGCAATAT CAACAAATAT 2580 

GATATTTGCG AAATTGCCTT CAGGACGTCA TCCAGAGAGC CGTTGGGCTT GGTTGAAATC 2640 

TTCAGCTGAA ATAAATGGTA TTATAGCTGT TTTGGCAGGA ATTTTGGTAT ATGGAGGAGT 2700 

TCAGCTTATT GGTTTAACTG CTAATCATGA GATGAGAATA TTTATTCTCA TCATCCTAAC 2760 

AAGTTTAGTA TTCATGTCTA TCGTGACCAC TTTAGCAACG TGGAATAGCC GTATAGGAGC 2820 

TTTTTTCTCA CTTATTTTGC TTTTACTACA GTTAGCATCA AGTGCAGGTA CTTATCCACT 2880 

TGCTTTGACA AATGATTTCT TTAGATCTAT TAATCCCTGG TTACCAATGA GCTATTCAGT 2940 

TTCGGGATTA CGACAAACAA TCTCTATCAA CAAGTCATTT TCCTAGCTGT CATACTAGTT 3000 

CTATTTACTA GTTTAGCTAT GCTAGCCTAT CAACATAAGA AAATGGAAGA AGATTAAAAA 3060 

AATCGACCGA TTAACTGGTC GATTTTTTAT GCCTTAGATG ACTTTCGTCT GTGATTATAG 3120 

ATTCCAAATA GTAAGAGACA ACTAAAGGAA CAGATTGCTC CAGTAATAAA ACCATTGGCA 3180



TGAGCTTGTT TAATTTCTAT TTTCTTACCA TCTTGGTAGG CAGACCAACC TTTGTCATAA 3300 

GGAATGGTGA AGAAAATAGA TGTATCTTGT TGGACATCAT ATGTAGCAAA AACCTTGTTT 3360 

TTAGAAGTTG ATACTGTGAC AGGTTGTTCT TTAATTTTTT GAATTCCCTC GGTGAAAGTT 3420

TTGGTATCTA AACGATAGAA GGTAGGAGAT TCAAATGATA CTTGTGAATT TCCAGGGAAA 3480 

CTAACATTGA TATTGAAAGT TTTTTTCTCT TTAGTATATC CTAGATTAAA GAAGGAGAAG 3540

ACATTATCAG TTGTAAAAGT CTTTTTTTCA CCATTTACAA GGATGTCAAC CTTCTTTTGT 3600 

TTATCGTTAG AAAAGTGAAG GTTTATGAAA GAGAGATAAA CTTGGCTGTT TTCTGGAACT 3660 

TCAATTTGAT ACTGGATTGC TGCATCTTCA TTTGAAGAAC TTGTGACACT AATCAAATCA 3720 

TTAGTATTTT CTATTTTTTC TGTTTTTTCA TAAGCTATTG GAGAAAAATA ATCAAAATTG 3780 
ACGTTAGCAA GTTGATTTAA AAATGAGGCC TGATTATCCA AGGXAXpTTC ATTGAACTTG 3840

ACATCATTGT AAACAGATTG ACTCGCAACT GCAATCGGAA GAGAGTATTG ATTTTCATAT 3900 

AGGGTAAGAT TATCTTTTTG ATAGATATCT TTAAAGCCAT ACTTATCAAT AGGACTGTCT 3960 

GAGATATTGT ACTGGATACC AAATAAACTA TCAGCCAAAA TACTATTATT TGCATATCCG 4020 

AGATTGAGAT TAGTCCCAGA GGATTTAAAA CCAAGTTTAT CTAAAGTAGA GCTTGATGAA 4080 

CGATTTCGAA CAGATGAAAA TTGAGACATT CCATTGTAGT TGAATTTCAT ACTGTCATTT 4140 

CCTGTCTGAG TTTGTAGTTT TTCAGTACGA GTAAATTGAT TTCCAATATA TGTTGAGAAA 4200

GATTCCATAG CTGGGATATC TCGACTATAA GCACTTCGAG AAGCAAATCC CCATTCCTTA 4260

GCAATTCCGT CCATTTGAGA TGAAGCATTT AAACTCATTT CAACCAGTAT AAATAAAGAG 4320 

ATTAGAATGG CAAATAGATT CACAGATATA AACTTTTTGA TAACTGCAAG GAGTAAAAGA 4380 

GAATAGACAA CCAAAAATTC AAGAGTAAGC AGAATATTCA AATCTGTTAA AAAAGAATAA 4440 

TGCGATTTTA GATAGATGGT AGCTAAAAAT CCTCCTACTA CAAGAAAAAG CGAAACTAAA 4500 

AAATTCCAGA CTTTAAGTTC TTTCAGACGC TTTAAGACTT CTGCTGCTGT GTAAATTAAC 4560

AAGGTAGAGA AAATCCAAGC ATAGCGATGT AAAAACATGT TTGGAGTATG CATGCCTTCC 4620

CAAAATAAGT CAAGAGCTTC TATGTAAAAG CTTGCAATTA GAAATGCAAA GAATATTACA 4680

TATATGAGTT TCACGTGAAA CTTAATAGAT TTCAGCGTAA AAAATAAAAT GGTCAAAATA 4740 

AAGGGAAATA GTCCAACAAA AATCATTGGG ATGGCCCCAT ACTTTGTTGT GTCAAAGGAA 4800 

CCAATGAATT CCTTAGCAAA GAGATCAAGA TACCAGCTAC TTTCAGTTTC AAACTTTGTA 4860

ACTTCAGTCA ATTTTTCCCC ATCTCTCTGT AAATCAAATA GAGTGGGAAG AGTCATAATC 4920 

AAACTAGCCA TACCAGCTAA AAAGGAGATA ACTATOAAAT CAAGANCAGA TGATTTTCGA 4980

GTCTTAAAGT CCCACGAAAT TTGACAGAGA TACCAGAAAA TAAGAAACAA TACTGTCATA 5040 

TATCCAAAAT AATAATTTTG AATAAATAAG ATTGACAGAC TTGTAAAGTA CAATAGGAGT 5100 

TTCTTTTCAG TTATCAGTAG ATGTAAACCA GTTATAATTA AAGGAATCAA GATAAAAACA 5160 

TCTAGCCAGG TTTTTATCTC TAATTGACTG ACAGTGAAAC TCATCAGAGC ATAGGAAGTA 5220 

GATAAGGCTA GTTTTAAAAT CTGAGGGATA GATTGAAACA ATTTATTCAA ACTAAAAAAG 5280 

GTTGACAGAC CAATCAATCC AAATTTTAAG AGAGTTGTCA GATAGATAGC ATCTGGCATA 5340 

TTCGTTAGAT CAAAAAAGTA AACCAGAGGC GCGAGAAAAC TACCCAAGTA ATAACTAGAT 5400 

AGGGCATAGA ACTTTACCCC TAGACCACTT GTAAAGGTGT AAAACAGATT ACTATTTCCA 5460

TGTAGGATAT TTCGTAAGGC TACATCAAAA ATAACGTATT GATGAAAGCC ATCPCCTAAT 5520 

AGAGGAGAGT TGTCGCTATT CCAGTAGATA CTTTGAGATA GATATACTCC AGACATAATC 5580 
ACTACAGGAA TGATGAAAGA AATAAAATAG GTTCGATATG TTTTTAAAAA TGATTTCATG 5640

TTACCTCGTA GAATGATAGA AAACTCAGTT GGTTAACCCA ACTGAGTTTT GAAGTTTTAT 5700 

TTAGTCTTTC CAAAGTTCTT TAACTTTTGC TTGTACTTCT GCATTTTCTA GGAATTCATC 5760 

GTAGGTTTCA TCGATACGGT CAATGACGCC ATTTTTAGAT AAGACAATGA TATGGTTAGC 5820 

CAAAGTTTGA ATAAATTCGT GGTCATGGCT GGCAAAGATG ATTGATTCTT TAAAGTTTTT 5880 

CAATCCATCA TTCAAGCTTG AGATAGATTC CAAGTCCAAG TGATTTGTTG GATCATCAAG 5940 

TACAAGGACA TTTGATTTTA AGAGCATGAG TTTTGAAAGC ATGACACGAA CTTTTTCTCC 6000 

CCCTGACAAG ACATTTACAG GTTTGTTAAC TTCATCTCCA GAGAAGAGCA TACGGCCGAG 6060 

GAAGCCACGT AGGAAAGTAT TGTCATCTTC TTCTTTACTT GCGAATTGAC GCAACCAGTC 6120 

AAGAATTGAT TCTCCTCCTG CAAAATCAGC TGAGTTATCT TTTGGTAGGT AAGATTGACT 6180 

AGTTGTAACT CCCCACTTGA CAGTTCCTTC ATAGTCAATA TCTCCCATGA TTGCACGAAT 6240 

TAATGCAGTC GTTTGAATAT CATTTTGTCC AATAAGTGCT GTCTTATCAT CTGGACGCAA 6300 

GATGAAACTA ATATTATCCA AGATAGTTTC ACCATCAATC TTTACAGTTA AATTTTCTAC 6360 

TGTCAAGAGA TCATTACCAA TCTCACGTTC CGCTTTAAAG TTGATAAATG GATATTTACG 6420

ACTAGATGGC ACAATCTCTT CTAGCTCAAT CTTATCAAGC ATTCTCTTAC GTGATGTTGC 6480 

CTGCCTTGAC TTAGAAGCAT TGGCAGAGAA ACGAGCAACA AATTCTTGCA ATTGTTTAAT 6540 

TTTTTCTTCT GCTTTAGCAT TACGGTCTCC TAGCAATTTA GCAGCAAGCT CAGAAGATTC 6600 

CTTCCAGAAG TCGTAGTTTC CGACATAGAG TTTGATTTTT CCAAAGTCAA GGTCGGCCAT 6660 

GTGAGTACAA ACTTTGTTTA AGAAGTGACG GTCGTGGGAT ACTACGATAA CTGTGTTATC 6720 

AAAGTCAATC AAGAA J „".\ACCAAGr AATC TATTOO A . . ;;.V-w CS77AGTAGG o^iO 

CTCGTCCAAG AGAAGAACAT CTGGTTTACC AAAAAGTGCT TTGGCGAGGA CAACCTTTAC 6840 

TTTTTCACCG TTGGCCAATT CGCTCATGTT TTGGTAGTGT AATTCTTCTG GAATGTTTAG 6900 

GTTTTGAAGT AGTTGAGAGG CTTCACTCTC TCCTTCCCAA CCTCCAAGTT CGGCAAACTC 6960 

TCCTTCGAGT TCGGCAGCAC GAACCCCGTC CTCGTCTGAG AAATCTTCCT TCATGTAGAT 7020 

AGCATCTTTC TCTTTCATGA TGCTATAAAG TTTTTCATTT CCCATGATAA CGACATCAAT 7080 

GGCACGTTCA TCTTCGTAGT CAAAGTGATT TTGACGAAGA ACAGAGAGAC GTTCATCTGG 7140 

ACCAAGAGAG ATGTGACCAG TAGTAGGTTC GATATCTCCA GCTAAAATTT TTAAAAAGGT 7200 

TGATTTTCCG GCACCATTAG CACCGATTAA TCCGTAAGTA TTTCCTTCTG TAAATTTGAT 7260 

ATTGACATCA TCAAAAAGTT TGCGATCACT AAAACGTAGT GAAACATCAG ATACTGTAAG 7320
CAATGTTTTT CTCCTATATG TGTAATATAT TTATTCTACT AGAAAATACA GAAATATTCA 7380

AATTTTTATT TGTCAATTTT GTGTAAATTA TATTTACAGT ATCCTTTACA CAAATCTGTA 7440 

AAAAGCAAGG CTGATTTATT TTGATAAATT ACGGTTATTT CATTAAAAAA ATGCTATAAT 7500 

TGAAAGGACT ATATCGAAGG AGAACAAAAT GACTAAACCC ATTATTTTAA CAGGAGACCG 7560 

TCCAACAGGA AAATTGCATA TTGGACATTA TCTTGGAAGT CTCAAAAATC GAGTATTATT 7620

ACAGGAAGAG GATAAGTATG ATATGTTTGT GTTCTTGGCT GACCAACAAG CCTTGACAGA 7680 

TCATGCCAAA GATCCTCAAA CCATTGTAGA GTCTATCGGA AATGTGGCTT TGGATTATCT 7740 

TGCAGTTGGA TTGGATCCAA ATAAGTCAAC TATTTTTATT CAAAGCCAGA TTCCAGAGTT 7800 

GGCTGAGTTG TCTATGTATT ATATGAATCT AGTTTCGTTA GCACGTTTGG AGCGAAATCC 7860 

AACAGTCAAG ACAGAGATTT CTCAGAAAGG ATTTGGAGAA AGCATTCCGA CAGGATTCTT 7920 

GGTCTATCCA ATCGCTCAAG CAGCTGATAT CACAGCTTTC AACGCTAATT ATGTTCCTGT 7980 

TGGGACAGAT CAGAAACCAA TGATTGAGCA AACTCGTGAA ATTGTTCGTT CTTTTAACAA 8040 

TCCATATAAC TGTGATGTCT TGGTAGAGCC GGAAGGTATT TATCCAGAAA ATGAGAGAGC 8100 

AGGGCGTTTG CCTGGTTTAG ATGGAAATGC TAAAATGTCT AAATCACTAA ATAATGGTAT 8160 

TTATTTAGCT GATGATGCGG ATACTTTGCG TAAAAAAGTA ATGAGTATGT ATACAGATCC 8220 

ACATCATATC CGCGTTGAGG ATCCAGGTAA GATTGAGGGA AATATGGTTT TCCATTATCT 8280 

AGATGTTTTT GGTCGTCCAG AAGATGCTCA AGAAATTGCT GATATGAAAG AACGTTATCA 8340 

ACGAGGTGGT CTTGGTGATG TGAAGACCAA GCGTTATCTA CTTGAAATAT TAGAACGTCA 8400

ACTGGCTCCG G 8411 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9064 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TGCCGTACTC AAGTACAGCC TGCGCTAAGT TTCCTAGTTT GCTCTTTGAT TTTCATTGAG 60 

TATTAGTAAC CAAAATCCGA CCACATAGCC AGCCCCTATG AATATAGCCA TTAAAGCTAG 120 

CATGGAATTT AGGAAATTAA AAACCACCGC AGATACAAAG GTTAGCACAA AAACATTAAA 180 

AGCAATGGTG TCAGAAGCCA AGACTAGAAT ATAGGGTGTC AACCGATCTA AAGTTTTGGA 240 

ATCTAGCAAA AATAAGTGTT TATACATGAT GACCTCCTCT ATGGCTGAAA AGCAAGCCTT 300 
TTGTTTTTTT ACCCCAAGAC CCTATGTAGA AAAGTGAGCA AAAACGGGAA GGTCGCTACA 360

ATATTATTGA TCACATGCAC CGCATAGGAT GGATAAATGC TCTTGCTATA GCGGGTCAAA 420

CCAGCAAAGA TGATTCCAAC TGTTGCAAAG ACGAAGATAT CTAACAGACT AGGCAGGCTT 480

GAAAAATGAG GGAGAGCAAA TAAAATAGAA GGAAGAAGCA AATCAAGACC AAATCGCGAA 540 

TGCTTAAAGA AAGCATGTTG CAGTAATCCT CTATAAATCA ATTCTTCCAT CAGTGGAACC 600 

AGAAAGAACA GGGCTATATA AATACCTAGC TCTGCAAAGT TAGTCCCACT ATAACCAATC 660 

AATACAGCCC AACCTTCCGC AGTTGACTGA ACATGTTTAG CTGTCTGAAC GTTAAAAGAG 720 

ATCTGGAACA CTAGCACTAA TACTGTCAAA ATCGAATACC AAAGCCATTT TTTTCTTGGA 780 

ATGCGGAAGA GATAACCATG GCCTGTCTTA ACAAGAACCA CAATCATGAC TCCAATAAAA 840

AGTAAACTCA AGATATTTTG AATCCAGAAT AAATTGCCTA TCTGAGAAGA AAATTGCCAA 900 

TAGTTTTGGA CGATAAGCGT CAGCTGAGAA AGACTAAATA CGAAAAATAA GTAAGAGAAG 960 

ACTGCACTTA TTTTGAATAG AAGTTGATAC TTTTTCATAG AAATCCTCCC TACTATGACC 1020 

TCACCTTGTC AGGCTCTACT GCTGTAAGAT TAAGAAGACA GTTTGTTTTT TTTAAGGCTA 1080 

ACCTGACTAC TAGATAATAG ATACATTAAG GCATTAAAGA CAATGAAAAT ATGTCCATAG 1140 

AATAAAATCA ACCTCGCATC CAAACCAAGA TAAAGTTTGA TTATCAAAAA GATGAGCAAA 1200 

AGAATTTGAA ACCATAAGGT TTTTCCAAAA ATAAATTTAA AGCGATTTCG AATATCTACT 1260 

TCCTTGATTT TTACCGCCAC CCCTTTATTA GCAAGAAGGA AAACTCCTGC TTCAAACAAA 1320 

CCACTGTAAA GAACAAGCCA CCCAATAGAT ACGATAGAGA TTTGTAAAAA TGTCCCTAAA 1380 

AGAATATCCA ACACACTACT CAAGAAAATA ACAAAAAATA ATCTGTATTT CATATTAAAT 1440 

ACCTCCATTC ATTTATTTCA CTAACAATTT AATAGAGCCT TCTACTCAAA TATCCTGTCA 1500 

GAAAAGCATA CAAAGCTACT TTTTATAATA CTTCAAGCCC CACATGAGCA GAAGCGTGAT 1560 

AAACAAGCAG AGAATACACC TATATAAGCG ATTAGTTGTT GATAGAATTC TGTTTCTGAA 1620 

ATACCTCTAT ACAAACAAAT GACAAACATA AAATCTGCCA AGCCGATAAA CATAAGTTGA 1680

TTGGTTCTAG GACTAACCAA ATCATCATTT ACTTATATTT AAGAGTATCT CTTTTATTTT 1740 

AATGTATCTT AGCACTGAAA AGCAAGACAG GCCAATAATA TTTAAAATGA ACAGTAACGG 1800 

GGTTAAGTCT CTAAAAAAAT TATCTACTGA CACTACAAGA AATACTATAC ATATTATAGT 1860 

CGAAACTATC TTTTTCTTAT CCATAATTAT TTACTCCTTT CCTAACAAAT CCAGCTTATC 1920 

AATCAAGACC GATTTTTAAC ATAATGTAGC AGCACCCGTT GCAACTTTGA CAAGTTTAGT 1980 

ATATCATTGT TTTTTAAAAT TTTTCATCCA AATCTTGAAT TGTCATCGAA ACATCTTGAA 2040 
TTGTTAAAAA ATTTAAAAAG TAAGCATTAA AAACATACTT TCCTCTTTAT ATTGTATTGA 2100

TACCAACTTG TTTGTAGACT TTTCATCCTG CTATCACATA TCATTTTGAC AGGCGAAACA 2160 

ATATTAAAGA AACTCCCCTG TAAATTAAGC TAGCAAATAC AGGCGAGAAA TTTATTTTTT 2220 

AGAGAGTACT ATCCGTATCC TTTTTGGAAG ATTTTGAAAA TATTTTTCTA ATTAAGTCAT 2280 

CCATATAACG ACCAAATATA CCAACTACTA AACCAATAAT AAAACTTTTA AAATCCATAA 2340 

TTACCACCAA CATATTGCTG CATAGGCTAC ACCTCCAAGT ATAGCTCCAC CTGCAGCACC 2400 

AGTTACACCT ATTCCTATAG CAAATGGTCC CAATAGAAAT GTCAAACCGT TGTTGCACAC 2460

CCATCAATTG CGCCATATGC AACCCCTGCT GCACAACTAA TTTTTCTTCC CCAATCAATA 2520 

TCTCCACCTT CAACGCAAGC AAGCATTTCA TTATCCATAA CTGCAAATTG TGACATCATT 2580 

TTTGTATCCA TATAGTGTAT CACTTTTCAG TTACGGAACA AGTTTAATAT AAAAATTATC 2640 

AAAAAAACAT AGGCAATAAA GAGAAAAATT AATTTATCAT AGATTAGAAA TAATATGACA 2700 

AAACAATTCA ATGATGTTAA TTCAATAGTC TTTTGTTTTT TATCGGAGAT ACTTATGGAT 2760 

AGATAAATAA GATAGGTTTG AAAAGCGAAG AGAATAATAA AGAATATAGC CTTCATAAAA 2820 

TTTAGCTTTC ATTTTTATGA TGTAGCGGTA TAGGCTAAAT ATCCACAAAC CACTGCTCCT 2880 

CCAATTCCTC CTATTGCAGC GCCCCATGGT CCTAGAACTC TCCCATATTT CACTCCACCC 2940 

GCTGCACAAC CTAAAGCAGC AACTACAGCT GCTCCTCCGG AATTACCTCC ATAAACCTCA 3000 

CTCAGCATTG TTTCATTTAT ATTACAATAA GTATTCATAC AAGTCTCCTT TTATTAAAAT 3060 

CCACCCCTTG CCCCTGTTAC TCCTGCCCAA AGATCCACAC CAAATTTAGC TCCTATGTAT 3120 

CCACATGCTC CCATAAATGG TGCTCCAACA CCACTCGCAG CACAAATAGC TGTCCCTAGC 3180 

CCCCAGCCAC CAAAAGCAGC ACCACCACCT TCTAAGACAT TAGTTTGCCA ATTATTCTTG 3240 

CCTCCTTCAA TACTAGATAA CATAGTTATA TCCATTTCAT GAAATTGTTC CATAATTTTT 3300 

GTATCCATGA CAAATACTCT TTTTTATTTT TAATTTTTGT CTTGTTGTAA CTTTGACAAG 3360 

TTTAGTATAT CATCGTTTTT TAAAATTTTT CATCCAGATT TTGAATAGTC ATCGAAACGT 3420 

CTTGAATTGC AAAAATTACA TTAGACTTCC TGCAAAACTA GAATCCTAGT TCATGATTGA 3480 

TAATACCAGC ACTCAAATTC ATTCGTAATC CGAAGCGTTT ACGATGACTT CGATAGGTTG 3540 

TTGAAAACAT TTTAAACGTT TTTACTTTGG CAAAGATGTT CTCAACCTTG CTTCTCTCCT 3600 

TAGATAGCGC ATGGTTACAG GCTTTATCTT CAACTGTTAG CGGTTTGAGT TTGCTGGATT 3660 

TACGTGAAGT TTGTGCTTGA GGATATATCT TCATGACCCC TTGATAACCA CTGTCAGCCA 3720 

AGATTTTACC AGCTTGTCCG ATATTTCTGC GACTCATTTT GAACAACTTC ATATCATGAC 3780 

AATAGTTCAC AGTGATATCC AAAGAAACAA TTCTCCCTTG ACTTGTGACA ATCGCTTGAG 3840 
TCTTCATACC GTGAAATTTC TTTTTACCAG AATCATTCGC TAATTCTTTT TTTAGGGCGA 3900

TTGATTTTTA CTTCCGTCGC ATCAATCATT ACCGTGTCCT CAGAACTGAG AGGAGTTCTT 3960 

GAAATCGTAA CACCACTTTG AACAAGAGTT ACTTCAACCC ATTGGCTCCG ACGGAGTAAG 4020 

TTGCTTTCGT GAACACCAAA ATCAGCCGCA ATTTCTTCAT AAGTGCGGTA TTCTCGCACA 4080 

TATTGAAGAG TGGCCATAAG AAGGTCTTCT AGGCTTAATT TAGGTTTTCG TCCACCTTTT 4140 

GCGTGTTTAA GTTGATAAGC TGTTTTTAAT ACAGCTAGCA TCTCTTCAAA AGTCGTGCGC 4200 

TGAACACCAA CAAGACGCTT AAATCGTGCA TCAGTTAGTT GTTTACTTGC TTCATAATTC 4260 

ATAGAACTAT AGTAAAATGA AATAAGAACA GGATAAATCG ATCAGGACAG TCAAATCGAT 4320 

TTCTAACAAT GTTTTAGAAG TAGAGGCGTA CTATTCTAGT TTCAATCTAC TATACTATAC 4380

CATATTTTGT TTCGCAGGGA ATCTATTATA AAAGGGTAAG TATTGCAAAA ACACTTACCC 4440 

TTTTCTTTTA TACTTCATTA AGCTCTACTT TTTATAATAC TTCAAGCCCC ACATGAGCAG 4500 

AAGCATGATG ATTAAGCAGA GAACAGCGCC AATATAAGCG ATTATTTGTT GGTAGGATTC 4560 

TCCTGCTGTG ATACCTCTAT ACAAACAAAT AATAGACATA AAACCTCTCA AGCCGATGAA 4620

CATAAGTTGA TTGGTTCTAG GACTAACCAA ATCATCATCT TCAAACTCTC TTATCCTCAT 4680 

TTCCCTAGTG AGATAAACAG TAACCAAAAT AGAAGCCAAG TTAATAACTA CTAAAAGAAA 4740 

TTGGAAAACT ACGGAAAAAT TTAAAAACTG ACGAGATAGA AATAGATAAG TAGAAACAAG 4800 

CAAGGGCAAC TGACCTAAGA ACAATCTCGC AAGGAAGATG TTCCGTTTTT TAGCAAGAAA 4860 

AGTTTTCATT TCTTTTCTCC TTTCTTTTTA TTGATAGCAA AATAGATCAT AACTGCAATC 4920 

ACATAGGCTA TGGTATAAAA TAGCTGATAC CAAGCACTCT CCCTAAGCGG ATATAGAAAG 4980

ATGGACATGA TTAGATACAG AACGAAAATA ATCAGTATTT TTTTCTTCAT AAGATTTCCT 5040 

CCTAAATGTG CGATTTATCT TAGTTGAGCA AGAACATTTA CACTGCTAGT ATAGCACTTA 5100 

TTTTGACCTT GGATCACTCA AATCATAAAT CGTCATCAAA ACCTCTTGAA TTGTAAAAAT 5160 

TAAAAAAGCA AGCATGAAAA ACATACTTTC CTCTTTATAT TGTATTGATA CCAACTTGTT 5220 

TGTAGACTTT TCATCCTGCT ATCACATATC ATTTTGACAG GCGAAACAAT ATTAAAGAAA 5280 

CTCCCCTGTA AATTAAGCTA GCAAATACAG GGGAGAAATT TATTTTTTAG AGAGTACTAT 5340 

CCGTATCCTT TTTGGAAGAT TTTGAAAATA TTTTTCTAAT TAAGTCATCC ATATAAGGAC 5400 

CAAATATACC AACTACTAAA CCAATAATAA AACTTTTAAA ATCCATAATT ACCACCAACA 5460 

TGTTGCTGCA TAGGCTACAC CTCCAAGTAT AGCTCCACCC GCAGCACCAG TTGCTGCACC 5520 

TTGCCATGTT CCTGTTTTAA TGCCTAGTTG AAGACCTCTT GCTGCTCCTC CTCCAACACC 5580 
TGCTTTGGCA AAATCTCCCC AATTGCATCC GCCACCTTCA ACCCAAGCAA GCATTTCAGT 5640

ATCCATAACA GAAAATTGTG ACATCATTTT TGTATCCATG ACAAATACTC CTTTTTTAAA 5700 

AAACTAAAAT AAATCAGAAT AGAATCCTCA TAATTTTACT ATAAGTCTTA CCAACTTAGT 5760 

CCCAATTTAT CACCAACCAT ACCTCCTAAG CATGTTAATC CACCCCCAAT TGCACCAATG 5820 

TGTGCTCCAA CAAATGCACC AGCAAGTCCA GCTACTCCTA AAGTGGCCAA ACCTGCTCCA 5880 

GTTCCACCAG TTATAATTCC CGTAGTGACT CCTGTAATCA GTGCATTTTG ACAATCAGTG 5940 

GAGCTATACC CCCCTTCAAC TTTCGCAAGC ATTTCAGTAT CCATAACCTC TAACTGTGAC 6000 

AACATTTTTG TATTCATGAT GAATACCTCC TTTTTATTTT CAATTTGTTA CCAAAGTCTT 6060 

AAATTCAATA AACAAATAGA TTTTTTATAG TATCTTTTTG ATTTTCTTAA AAAAGTATAT 6120 

ACGTCTACTA TCTTCTTAAA GGTAGCAGTA CCTATTTTTT AGTCTAAGAT TTCAATAATC 6180 

TTGAGTATCT AAAATATCTT AATTTCGTTA TTCTCCTTGC AATAAAAAGT TTTACTATAC 6240 

TATTTATTAA CTTGCAGAAA GCAAAAAATA TTAGTAAATA ATAGTTTATA GTTAAGTTTT 6300 

TTATTCCTAC CAATCCATCA ACTAAGTAAA GCATCAACGA TTACATAAAC GATTGATAAT 6360 

ATAATTAAAA TTTTGCTAAC TATCTTATTC TCATCATTCT TAGATAACTT TGATATTTTG 6420 

TAAGTAAGTA AATAAGACAC TAAATTAATA GCGATAATAA TACTATATTT AAGAATCATA 6480 

ATCTTACAAA GAGGACATAA TTCCTGAACC TACACAAATA AGTGTTGCTG CTCCCCCAGT 6540 

TATCGGACCA GTCGCAGCAG CTAATAGTAC TGCTCCAATA CAACCACCGA TTGCAGATCC 6600 

TAAATTGCCT CTTCCTCCAC TAACTATTTC GAGTTCTTCA TTATCCATAA CAGAAAATTG 6660 

TTCCATCATT TTTGTATTCA TGACAAATAC TCCTTTTTTC TTTTTTTATT TTTGTCTTGT 6720 

TGTAACTTTG ATAAGTTTAG TATATCA7CC TTTTTTAAAA TTTTTCATCC AGATCTTGAA 6780 

TTGTCATCGA AACGTCTTGA ATTAGCTTTT TTATTTCAAG CCACCTCTAA ATGTTTAAAA 6840 

AAAATAATTT CTAATCACTT TTTTACCATT CAGGAAGTTT TAATGACTAT TCAAGATTTC 6900 

ATAAAATATG AACTTAGTTT TATGACATAA TAGACCTATC CACTATATGA AAGGAATTGC 6960 

CAATGACTTC TTATAAACGT ACATTTGTTC CTCAAATAGA TGCGAGAGAC TGTGGTGTCG 7020

CTGCCTTAGC CTCGATTGCT AAATTCTATG GTTCAGATTT TTCTCTAGCT CACTTGAGAG 7080 

AACTTGCAAA GACCAATAAA GAAGGGACGA CTGCTCTTGG CATTGTAAAA GCCCCTGATG 7140 

AAATGGGCTT TGAAACAAGA CCTGTTCAAG CAGATAAAAC GCTCTTTGAC ATGAGTGATG 7200 

TCCCCTATCC ATTTATCGTT CACGTTAACA AAGAAGGAAA ACTCCAACAT TACTATGTTG 7260 

TCTATCAAAC AAAGAAAGAC TATCTGATTA TTGGTGATCC TGACCCTTCT GTAAAAATCA 7320

CTAAAATGTC AAAAGAACGC TTTTTCTATG AATGGACTGG AGTAGCTATT TTTCTAGCTA 7380 
CCAAACCCAG CTATCAACCC CATAAAGATA AAAAGAATGG TCTACTAAGC AAGCTTCCTT 7440

CCTCTGATTT TCAAACAAAA ATCTCTCATT GCTTACATTG TTCTCTCAAG CTTATTGGTC 7500 

ACTATTATCA ATATAGGTGG TTCTTACTAT CTCCAAGGAA TCTTGGATGA ATACATTCCA 7560 

AATCAGATGA AATCAACTTT AGGAATCATC TCAGTTGGTC TGGTTATCAC CTATATCCTC 7620 

CAACAAGTCA TGAGCTTCTC CAGAGATTAT CTCCTAACCG TTCTGAGTCA GAGATTAAGT 7680

ATTGATGTGA TTTTATCCTA TATTCGCCAT ATTTTTGAAC TTCCCATGTC TTTCTTTGCG 7740

ACACGTCGTA CAGGAGAAAT CATTTCACGA TTCACAGATG CTAACTCTAT TATAGATGCC 7800 

TTGGCTTCTA CCATTCTTTC TCTTTTTCTG GATGTTTCTA TTCTGATTCT TGTAGGAGGC 7860 

GTCTTACTGG CACAAAACCC TAATCTCTTC CTTCTTTCTC TTATTTCCAT TCCTATATAC 7920 

ATGTTCATCA TCTTTTCTTT TATGAAACCT TTCGAAAAAA TGAACCATGA TGTCATGCAA 7980 

ACTAATTCTA TGGTTAGCTC TGCCATTATC GAAGATATCA ACGGGATTGA AACTATAAAG 8040 

TCGCTCACGA GTGAAGAAAA TCGCTATCAA AATATAGACA GCCAATTTCT AGATTATTTG 8100 

GAAAAATCCT TTAAGCTCAG TAAATATTCT ATTTTACAAA CGAGTTTAAA GCAGGGAACA 8160 

AAATTAGTTC TGAATATCCT TATCCTATGG TTTGGCGCTC AATTAGTCAT GTCAAGTAAA 8220 

ATTTCTATCG GTCAGCTGAT TACCTTTAAC ACACTTTTTT CTTACTTTAC AACTCCTATG 8280 

GAAAATATTA TCAACCTCCA AACCAAACTC CAATCTGCGA AGGTCCCTAA TAACCGTTTG 8340 

AACGAAGTCT ATCTAGTCGA ATCTGAATTT CAAGTTCAAG AAAACCCTGT TCATTCACAT 8400 

TTTTTGATGG GCGATATTGA ATTTGATGAC CTTTCTTATA AGTATGGTTT TGGATGAGAT 8460 

ACCTTAACAG ATATTAATCT CACGATTAAA CAAGGAGATA AGGTTAGCCT AGTTGGACTT 8520 

AGTGGTTCTG GTAAAACAAC TTTAGCCAAA ATCATTGTCA ATTTCTTTCA ACCCTACAAA 8580 

GGGCATATTT CCATCAATCA TCAGGATATT AAAAACATTG ATAAAAAAGT CTTGCGCCGT 8640 

CATATTAATT ACCTACCCCA ACAAGCCTAT ATCTTTAATG GCTCTATTTT GGAAAACTTN 8700 

ACCTTGGGCG GTAATCATAT GATTAGTCAA GAAGATATTC TAAAAGCTTG TGAAGTAGCT 8760 

GAAATCCGTC AAGACATTGA AAGAATGCCT ATGGGCTATC AAACTCAGCT CTCTGATGGA 8820 

GCTGGTCTAT CAGGAGGACA GAAGCAACGA ATCGCTCTCG CTCGTGCTCT TTTAACTAAA 8880 

TCTCCTGTTT TAATACTAGA TGAAGCTACT AGCGGTCTTG ATGTCTTGAC TGAGAAAAAG 8940 

GTTATAGATA ATCTTATGTC TCTAACTGAT AAAACCATTC TCTTTGTAGC CCATCGTCTC 9000 

AGTATAGCCG AACGAACCAA CCGTGTCATT GTTCTTGACC AGGGGAAAAT CATTGAAGTT 9060 

GGTA 9064 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7780 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTCCATTTTT TTGATTTCAT AAATAAACAA CCTCTCTGTT AATTTTGTAT AATTATAACG 60 

ATATCCAAGT TACTTGTCAA GTGTTTTTTA AATTTTTATC TCAAAAATAT TTTTTCGTTC 120 

AAAAAAAGGA GCCATCAGTT GATTTCAAGC TCCCTTTTAT ACAGAATTAA ACTATTTTAT 180 

AGTTCGACAA TCTTACCTGT TTCAAAGTAG ACAACCCATT CACAGATATT TTTAGCATAG 240

TCACCGATAC GCTCCAAGTA GGAAATAACT TGGAAATAAT CACGACCCGT AACAATGGCT 300 

TCTGGATTTT TCTTAATCTC TTCAGTCGCA AGGTCACGGA TAGTTTCAAA ATAGTGGTTA 360 

ATTTGCTCAT CCATGGAGGC CACCCGGTAT GCGTCGTCAA CAGAACCATT AAGATAAAGA 420 

TCAAGTGCTG CTTCCACAAC GCTTTTAACT TCACGTCCCA TTTTTTTAAT TTCTTCCTCT 480

ACAGCTGGAA TGCGCTCTTC CCCCTTCATA CGGATGGTTG CCTGCGCAAT GGCTACAGCG 540 

TGATCCCCCA TACGCTCCAC ATCTGATACA GCCTTAAGGA CAGTCAAGAC TGTACGCAAA 600 

TCTTGAGAGA CTGGTTGTTG GAGTGCGATC ATTTCAAATG ATTTCTTTTC CAGTTTCACT 660 

TCGTATTCAT TTACTTCTGC ATCATCTTCG ATGACCTCTT TTGCCAGGTC ACGGTCATGC 720 

GTGACAAAAG CACGTACCGT ACGATTGATT TGTGAGAGCA CTTCTTGTCC CATAGCGTAG 780 

AACTGGTTAT GTAATTTCTC TAAATCTTCT TCAAATTGAG ATCGTAACAT CTTTCATCTC 840 

CTTATCCAAA TTTTCCTGTA ATATAGTCTT CCGTTTCCTT GTGTTGGGGA TCAAGGAACA 900 

TCTGCTTGGT ATCATTAAAT TCAATCAAAT CTCCATCTAG GAAAAATCCT GTCTTATCAG 960 

AGATACGTGA AGCTTGCTGC ATGGAACGGG TTACCAGAAG CATGGTGTAC TTGTCTTTTA 1020 

GACCATACAA GGTTTCCTCA ATTTTACCAG CTGAAATCGG ATCCAAAGCC GAAGTTGGCT 1080 

CATCCAAGAG GATGATTTTA GGACTAGTTG CCAAGACACG GGCCACGCAG ACACGCTGCT 1140 

GTTGACCACC TGACAATCCA ATAGCTGAAT CATATAGACG ATCCTTGACC TCATCCCAGA 1200 

TAGAGGCACC TTGCAAGGCT TTTTCTACGG CTTCATCCAG AACCTGCTTA TCCTTAATTC 1260 

CATTGATACG AAGCCCGTAG ACAACATTCT CATAGATAGT CATAGGGAAA GGATTAGGTT 1320 

GTTGGAAAAC CATTCCGATT TCCTTACGTA ATTCAACCGT ATCTGTACGC GGACTGTAGA 1380 

TGTTGTGACC ATTGTACACC ACGGATCCAG TTGTGGTCAC CTCTGGATTG AGATCTCCCA 1440 
TGCGGTTGAG AGACTTGAGG AGGGTTGACT TCCCTGATCC AGATCGACCA ATCAAGGCTG 1500

TAATTTCCTT AGGTTGGAAA GATAGGGAAA CACTATTCAA AGCCTTCTTT TTATTATAAT 1560 

AAACGGACAG GTCTGATACC TGTAAAATCG CATCTGTCAT ACGGTTTCCT TTCTAACCAA 1620 

AGTGACCAGA TACATAGTCA TTGGTGGACT GTAGCTTGGC ATTTTGGAAA ATAGTTGCAG 1680 

TCTTGTCATA CTCAATCAAA TCACCCAAGT AAAAGAAGCC TGTATAGTCA CTTGCACGAG 1740 

CAGCCTGCTG CATATTATGC GTTACAATGA TGATGGTAAA GTTTTTCTTG AGCTCAAACA 1800 

TGGTCTCTTC TAGTTGCATG GTCGCAATCG GATCCAAGGC TGAGGCTGGC TCATCCATTA 1860 

AGAGGATATC TGGCTTAACA GAGATGGCAC GAGCGATACA GAGACGTTGT TGCTGACCAC 1920 

CTGATAAGGT CAAGGCTGAC TTGTGGAGAT CGTCTTTAAC CTGATCCCAG AGGGCAGCCT 1980 

GACGAAGGGA GGTTTCTACG ATTTCATCTA GGACTTGCTT ATCCTTAACT CCAGCACGTT 2040 

CATGCGCAAA GGTAATATTA CGGTAAATTG ACTTAGCAAA TGGATTGGGA CGTTGAAAAA 2100 

CCATTCCAAT GTGTTTACGC ATTTCATAAA CGTTGATTTC TGGACGGTTG ACATCAATTC 2160 

CACGATAGAG AATCTGCCCA GTTACTTTAG CAATATCAAT AGTATCATTC ATGCGATTGA 2220 

CACTGCGTAA GTAGGTACAT TTCCCCGATC CCGACGGGCC AATCAAAGCT GTAATTTTAT 2280 

TTCTTTCAAA TTGCATATCA ATCCCCTTAA TGGATTCATT TTTACCATAG TAAACATGGA 2340 

CATCCTTAGT AGAAAGGGCT ACTTTTTCTT CAGGAAAGGT AAGGATATGC TTCTCATCCC 2400 

ACTTATATGT TGACATGGCT TCTCCTTTAG GCAGCGGTTA ATTTCTTGTG TAGATAGCTT 2460 

CCGAACTTAC GAGCTCCAAA GTTAAAAATC AGGATAAAGA TCAGCAGCAC AGCGGCAGAA 2520 

CCTGCTGATA CAATGGTTCC ATCTGGAATA GTGCCTTCAC TATTGACTTT CCAGATATGG 2580

ACAGCCAAGG TTTCTGCTTG ACGGAAGATA GAGATGGGGC TAGTCACACT GAGGATATTC 2640 

CAGTTAGACC AGTCAAGAGC TGGCGCCGAT TGGCCTGCTG TATAGATCAG AGCTGCAGCT 2700 

TCGCCAAAGA TACGACCAGA TGCCAAGACG ACACCCGTTA CAATACCTGG AAGCGCTTCC 2760 

GGAATAACAA CATGAACCAC TGTCTCCCAG CGAGAAATCC CAAGAGCCAG ACCAGCCTCA 2820 

CGTTGGGTAT GGTGAACGTG TTTCAAACTA TCCTCTACAT TACGCGTCAT CTGAGGCAAG 2880

TTAAAGACTG TCAAGGCCAA GGCACCTGAA ATGATTCAAA ATCCATACTC AAACTGGACT 2940 

ACAAAGATCA AGTAACCAAA GAGACCCACC ACCACTGATG GTAAAGAGGA CAAAATTTCA 3000 

ATACAAGTCC GCACAAAGTT GGTAACAGGA CCTTTTTTAG CATATTCAGC CAAGTAAATC 3060 

CCAGCTCCCA TAGAAAGAGG TACAGAAATA ATCAAGGTAA TGACCAATAG GAAAAAGGAA 3120 

TTGTAAAGCT GAATGCCAAT CCCACCACCT GCTTGAAAAG CAGAAGACCT TCCAGTCAAG 3180 
AAAGACCAAG AGATATGGGG CAAGCCCCGA ACCAAGATAT AGAGAATCAA GGAAGCCAAG 3240

ATTGTCACAA TGATGCTAGC AATCGTATAG AGGACAGCTG TTGCAAGTTT ATCTAATTTC 3300

TTAGCGCGCA TAATTTTTCT TTCCTCTTTC TTTCGTAATC AATTTAATCA CACTGTTAAA 3360 

AACTAAGCTC ATCAAGAGCA GTACCAAGGC CAGTGACCAG AGAACATTAT TATTTACAGT 3420 

TCCCATGACA GTGTTCCCAA TTCCCATAGT TAATATAGAA GTTAAAGTTG CAGCTGGTGT 3480 

GGTCAAGGAA CTTGGCATAA CAGCTGAGTT TCCGACAACC ATCTGGATAG CTAGAGCCTC 3540 

ACCAAAGGCA CGCGCCATCC CAAAGACCAC TGCAGTGAAA ATACCAGAAC GGGCCGCCTT 3600

CAAGATCACA CGCCAGATAG TCTGCCAGCG AGTGGCTCCC ATAGCGAAAC TGGCTTCACG 3660

ATAATAACGA GGAACCGCAC GCAAGCTATC CGTTGTCATA AAGGTTACGG TCGGCAAAAT 3720

CATGACAAAG AGGACGGAAA TCCCTGACAA AATCCCAAAA CCAGTCCCAC CAAAGACACT 3780 

GCGAACAAAG GGAACGACGA CTTGCAAGCC AATAAATCCG TACACTACTG AAGGAATCCC 3840 

AACCAGGAGT TCAATAGCTG GTTGCAAAAT CTTCGCCCCT TTTGGTGATA CTTCGGTCAT 3900 

AAAAACTGCT GCACCAATAG CAAAGGGTGT TGCGATAAGG GCTGAGAGAA TGGTAACGAT 3960 

AAAGGAACCC AAAATCATAG GAAGGGCACC AAATTCTTTA CTAGAAGGAT TCCAAGTTCC 4020 

TCCCAAAAGA AAGTCAAAGA TATTCACACC ATTGACAAAG AAGGTCGACA AGCCTTTTTG 4080 

CGCTACGAAA ACCAAAATCA TGGCCACAAG GATGACTATC AAAGAAAGAC AGGCAAAGGT 4140 

CAAACCTTTT CCTAATTTCT CCAGACGAGA ATTCTTTGAT GGAAGCAACA TTTTCTTAGC 4200

TAATTCTTCT TGATTCATTA TTGTCTCCCT TCCAACACTG TCACAGTTCC GGCAGCATCT 4260 

TTTTCAACCT TCATTTCCTT AATCGGAATA TACTTCAATC CTTTGACAAT CCCTTCTTGG 4320

CTCTCATCCG AGAGAACAAA ATTGAGAAAT TCTGCAGCCA ACTCATTCGG CTGCCCCAAT 4380

GTATACATAT GCTCATAAGA CCACAAGGGC CAATTATTGC TACTTATATT TTCTGGACTT 4440

AAGTCATAGC CATTCAACTT CATGCTTTTG ACCGAATCAT CTATATAGGT AAGAGATAAA 4500

TAAGAGATAG CTCCTGGACT TTTTGATACG ATTGATTTTA CCCCTCCATT TGAATCCTGC 4560

TCCTGACTTT GCATGGCAGA CTGACCTTCC ATAATGACAG TATCAAAGGT AGCACGAGAG 4620 

CCAGAGCCGG CTGCCCGATT GATAACAGAG ATGGGTAAGT CCTTACCACC AACCTCTTTC 4680

CAATTGGTTA CCTCACCTAT GAAGATTTGA CGAAGTTGCT CTGTCGTTAG GTTATCAACA 4740 

TCAACCTCCT TATTGACAAT CAGAGCCAAG CCAGCTACCG CGACCTTGTG GTCAACAAGA 4800 

GCAGAAGCAT CAATTCCGTC TTTTTCCTCA GCAAATACAT CTGAGTTTCC TATATCAACT 4860 

GCCCCAGACT GAACCTGGGA CAAGCCTGTA CCAGAACCTC CCCCTTGGAC ATTGACCGTT 4920 

TTTCCAACAT GGATCGTGCC AAATTCATCT GCCGCTACTT CAACCAAGGG TTGCAAGGCA 4980
GTTGAGCCAA CAGCCGTTAT GGATTCTCCA CGATCAATCC AGCTAGCACA GCCTACTAAA 5040

CAAGCCGTCA GCCAAAAAGC GATAAGAGAC AGAGCAAGCT TTTTTCTTTT TTTCACTGTT 5100 

TTTCTCCTCG AAAATAATTA TGAATACTGT GAATTTTTTA AGTAGTTCTT TATGAGTTGA 5160

CGCATGAATT CTTACCAAAT TTCTGCGCAA TTGATTATTT ATATAATATA GGCTATATTA 5220 

CTCTTTCCTA ACCTCCTTTT TTCATATGTG GATAAAATCT CTTGTCTATC CCTTCCCCCA 5280 

TTGTCACCCA TTATAGTCAT TTCGTGTCTC TTTTTCCCCT TTTTAATGCA AGGGAAATTA 5340

CTCTCCTTAG ATGATAATCC AAAAGCTAGA AAGGTATCTC AAACCTCTCT ACTCTCCCAG 5400 

ACTAGTTTAC AACTAAAAGG AAAAGATTCT ATTTTATGAG AAATCTAGTT TACAAGCGGT 5460 

AAGAACGCTA ATAACTAAAC TTCTTGTACT CTTTGAAAAT CTCTTCAAAC CAGTGTTTTG 5520 

AGCTATCTAT GGCTAGCTTC CTAGTTTGCT CTTTGATTTT CATTGAGTAG TAAAACTACA 5580 

TGTAATGGCA ATCAAGATAT CAAGAATCAT CCTACTAAAA AAATCCATAC TTTCACTATA 5640 

ACATAGAATA AGATATTTGA CTAGCATTTT CATTTGAATC TGAGGCCTTT TGGAAAATAA 5700 

TTTTTCAAAA CATTTCCAGT AACCTTTGCA AAGCCCAAGC CATTGCCTTT AACCAAAACT 5760 

TGGTACCAAC CATTTGGCAG ACTTTCTGCC AGCTGAACGG TTTCTCCAGC CGCATACTTG 5820 

ACAAACGCTT CTTGGCCAAT TTCAACCGAC TGTTCGACCT GACTCGGTTT CAAGGCTAAA 5880 

CCAAGAGCGA AACTGGGCTC AAAGCGTTTC TTCTTAAAAG TACCCAGATG CAGTCCATTG 5940 

CGAGCAATCT TGAGCTTCCA TAAATCTGGC AAAAGTTCTG GCAAGAGATA AAGCTGGTCT 6000 

CCAAAAATCT GCAAGATACC CGGTAGATTG ACCTTCAAAT GGTTTTGGGC AAATTCCTGC 6060 

CACAAGGCAA CTTGTTCACG GCTGAGGTTA CTCTTACTTG CCTTAAATTT AGGAGCTGGA 6120 

TTGTTACCCT TAAACTGTAG ATGGGCAACA AACTGACCCT CTCCCTTAAA CTGATGAGGA 6180 

TACATCCGAG CCGTTTCTGG CAGGTCAATA CCAGCTACCA TTCCATTGAT ATGCTCTACT 6240 

GGCAACAAGT CAAAATCATA CTCTTCCAGC AACCAATTGA CAATCTCTTC GTTTTCCTCG 6300 

GGTGCCCAGG TACAGGTCGA ATAAACCAGA TGACCACCTT CAGCTAACAT GGTCACTGCA 6360

TCCTCCAGAA TTTCTCTTTG CAAGCTAGCA CATTGACTCG GATAATCTAA GCTCCAATAG 6420 

TCCATAGCAT CAGGTTGCTT ACGAAACATT CCTTCACCAG AGCAAGGGGC ATCAAGAACG 6480 

ATTAAGTCAA AATAGCCTTT AAAGACCTTG ACCAAGCGGT CGGCACATTC ATTGGTCACC 6540 

ACGACATTTG TCGCTCCAAA ACGCTCCATG TTTTCAACCA AAATCTTAGC CCGTTTGCTT 6600 

GAAATTTCAT TGGAAnCAAG TAGCCCCTCC CCTGCTAGAT AGGCTGCCAG TTGAGTTGAT 6660 

TTGCCCCCCG GTGCAGCAGC CAAGTCCAAG ACCTTCATAC CAGGACTGGG TTGGGCTACT 6720 
TGAGCCACCA TTTGAGCAGC AGGTTCTTGC GAATAAACTA AACCTGTAGC ATGCTCACGC 6780

GATTTCCCTG AAACCTTCCC ATAGTGGCCC CAAGGGGTTT GAGTAATGGC ATCAGAAAAG 6840

GAAAGTTGCT CTTCTTTTAA GGGATTGACC CGAAAGGCCG AAACCGCTTC CTCCTCAAAA 6900 

GAGGCAAGAA AATCTCTTGC CTCATCTCCT AGTATCTCTT TATATTTTTC AACAAATCCT 6960 

TCTGGAAATT GCATTTAAGT TCTTTTCCTT TCGTAAATAT AGGACTGAAT TTCCTCCTGC 7020 

ATCTCAAGAG GCACCATCAT GACCGGCTGT CTGGTTTGAA AATCACGAGC TTCACCAAAA 7080 

AGGGTCACAA CCCGATAGCC CAGACTTTCC CCTAAAATAC TAGCTGCGGC ATAATCCCAT 7140 

GGTTGCAGAT AAGTGAGATA GGTCAACAAA CGCCCTGACA AAATCTTGGC AAAACTAATG 7200 

GCCGCACTTC CATAGACACG AACACCAAGA ACCGCTCGGC TCAAATCACC CAGCCCCCAT 7260 

TCATTGGTTT CCAGCATACC ACTATTCCCT GCAATGAGAA AATCTCCAAG TGGTTTAGTT 7320 

TTAAAAGGAG CTAGGGACCT ATCATTTAGA CAAACTGGAA ATTCCCCACC ACCGTGGTAA 7380 

CAATCCCCTT TGACCACATC ATAAATCAGA CCAAACTGTC CCTGACCATT TTCAAAATAA 7440 

GCCATCATAA CAGCAAAATC TTCCTGCTGG GCTACAAAAT TATTGGTACC ATCAATGGGA 7500 

TCAATGACCC AAACCTTCCC CTCTTGAACC GAGGCTCCCA GACAACCTTC TTCAGCACAA 7560 

ATCTTATCCT CAGGATAACG GGACAAAATC TCACCAACCA AGAGTTCCTG AACTTCTTTG 7620 

TCCAGTCTGG TCACCAAATC TGTTGGAGAG GACTTGGTTT CAACACGCAA GTCTTCCTGC 7680 

ATATGGTCAA GAATGTACTG ACCTGCTTTC TTAACAAGCT CTTTAGCAAA TTCAAATTTA 7740 

CTTTCCAAGA GAAATCTTTC CTTCCCCTTT TTCTTTGGGG 7780 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4820 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GTAATGATAT AGGAACACCA GGTGACCTGA TGGGACGTCG TAAGCCTATG AACTACTAGC 60 

TGCTAAAGGC TTTAAAGATG GTATGGTACC ATATATCTCA AACCAATACG AAGAAGAAGC 120 

CAAACAAAAG GGCAAGACAA TCAATCTCTA CGGTAAAACA AGAGGTTTGG TTACAGATGA 180 

CTTGGTTTTG GAAAAGGTAT TTAATAACCA ATATCATACT TGGAGTGAGT TTAAGAAAGC 240 

TATGTATCAA GAACGACAAG ATCAGTTTGA TAGATTGAAC AAAGTTACTT TTAATGATAC 300 

AACACAGCCT TGGCAAACAT TTGCCAAGAA AACTACAAGC AGTGTAGATG AATTACAGAA 360 
ATTAATGGAC GTTGCTGTTC GTAAGGATGC AGAACACAAT TACTACCATT GCAATAACTA 420

CAATCCAGAC ATAGATAGTG AAGTCCACAA CCTCAAGACA GCAATCTTTA AAGCCTATCT 480

TGACCAAACA AATGATTTTA GAAGTTCAAT TTTTGAGAAT AAAAAATAGT GTCTACTATT 540 

AGGAAATAAA GTTTAAAAAG GTGATGAAGA ACAAACCAAG ATTCAAGCAG GAATTCCTAC 600 

TGATAATGAA GTAAGTTATG ATCTTATTTA TCAGCAGGAA ACTCTTCCTG CAACAGGTTC 660 

ATCAACTTCT GAGCTTACAG CTTTAGGCCT ATTAGCTGTT GGTAGTTTAG TTCTTTTGGT 720 

TCATAATATG ACGGGAACAG TTTTTTGCTC CCTCTCAAAA GTCATCATTT GATGGCTTTT 780

TTCTATATAG GGTAAAAGAT AGGGTAAAAG GCTATCATCG GACAAAATAA AGAAGGCATG 840 

ATATAATATA AAGTAGATTT CTATGTCATA AAACAAGAAC TGTTTGGACA TCATTCATTT 900 

GAAAACTCTC TATGTTCAAA CAATAGTAAA ATAAAATAGG GGATCTAAAT CCTTGCTATG 960 

AAAGGAAAAA ACTCAATGGC TACTATTCAA TGGTTTCCTG GTCACATGTC TAAAGCTCGT 1020 

CGACAGCTGC AGGAGAATTT AAAATTTGTT GATTTTGTGA CGATTTTAGT AGATGCACGC 1080 

TTGCCTCTAT CTAGTCAAAA TCCTATGTTG ACCAAGATTG TTGGTGATAA ACCAAAACTC 1140 

TTGATTTTAA ACAAGGCCGA CTTGGCTGAT CCAGCAATGA CCAAGGAATG GCGTCAGTAT 1200 

TTTGAATCAC AAGGAATCCA GACGCTAGCT ATCAACTCCA AAGAGCAAGT GACTGTAAAA 1260 

GTTGTAACAG ATGCGGCCAA GAAGCTCATG GCTGATAAGA TTGCTCGCCA GAAAGAACGT 1320 

GGGATTCAGA TTGAAACCTT GCGTACTATG ATTATCGGGA TTCCAAACGC TGGTAAATCA 1380 

ACTCTGATGA ACCGTTTGGC TGGTAAAAAG ATTGCTGTTG TTGGAAACAA GCCAGGGGTC 1440 

ACAAAAGGTC AACAATGGCT TAAAACCAAT AAAGACCTGG AAATCTTGGA TACACCGGGG 1500 

ATTCTCTGGC CTAAGTTTGA GGATGAAACT GTTGCACTTA AGTTGGCATT GACTGGAGCT 1560 

ATCAAAGACC AGTTGCTTCC TATGGATGAG GTTACCATTT TTGGTATCAA TTATTTCAAA 1620 

GAACATTATC CAGAAAAGCT GGCTGAACGC TTCAAACAAA TGAAAATTGA AGAAGAAGCG 1680 

CCTGTGATTA TTATGGATAT GACCCGCGCC CTCGGTTTCC GTGATGACTA TGACCGTTTT 1740 

TACAGTCTCT TCGTGAAGGA AGTCCGTGAT GGCAAACTCG GTAACTATAC CTTAGATACA 1800 

TTGGAAGACC TCGATGGCAA CGATTAAAGA AATCAAAGAA TTCCTTGTCA CAGTCAAGGA 1860 

GTTAGAAAGC CCTATTTTTT TAGAGCTTGA AAAGGATAAT CGCTCAGGAG TTCAAAAGGA 1920 

AATCAGCAAG CGTAAAAGAG CCATTCAAGC TGAATTAGAT GAAAATTTGC GCTTGGAATC 1980 

CATGCTTTCT TATGAAAAAG AACTTTATAA GCAAGGATTG ACCTTAATTG CAGGTATTGA 2040 

TGAGGTTGGT CGTGGTCCTC TTGCTGGTCC TGTAGTCGCT CCGGCCGTTA TTTTATCTAA 2100 
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AAATTGTAAG 


ATTAAAGGTC 


TCAACGACAG 

4 v— . *»#» v» vj nvr»v 


CAAGAAAATT 


CCTAAAAAGA 


AACATCTGGA 


2160 


GATTTTPCAA 


GCCGTTCAAG 


ACCAAGCCTT 

V^i V»#W VVJ^V* V^ 4 4 


GTCGATTGGA 


ATTGGTATCA 


TAGATAATCA 


2220 


GGTCATCGAC 


PAAGTCAACA 


TCTATGAAGC 


AACCAAACTA 


GCCATGCAAG 


AAGCAATCTC 


22B0 


CCAGPTCAGC 


CCTCAACCAG 


AGCACCTTTT 


GATTGATGCC 


ATGAAACTGG 


ACTTGCCCAT 


2340 


TTP APAAA.PP 


TPP ATT ATC A 


AAGGAGATGC 


CAACTCCCTC 


TCTATCGCAG 


CAGCATCTAT 


2400 


agt ifirrnAfi 


GTAAPACGTG 


ATGAATTGCT 

m \ 4 «V*» ^ 4 VJ V«» 4 


GAAAGAATAC 


GATCAGCAGT 


TCCCTGGCTA 


2460 




APT A. ATfif AG 


GATATGGCAC 


AGCTAAACAT 

4 ****** V^J* 4 


CTGGAAGGCC 


TCACAAAACT 


2520 


agiiapttapp 


PC* A ATTRACT. 


GAACCAGCTT 


TGAACCCGTT 


AAATCACTGG 


TTTTAGGTAA 


2580 


k a & Af a a a/^*t 


T A ATTP A A AG 


flAAATAAPAT 


CCAtV^AACAn 


TPGGAAATAG 


TCCGTTCTAA 


2640 




rpi w ?'Mr A 

LfV_k_ 1 i luvn i 


f*PA*fIPAC*TAT 


in i wv*wvi 


GTTGGTCGAG 

V 4- 4 4 W Uv%W 


GAATCATTGT 

Urtfi 4 V* • • 4 A A 


2700 


UvmUU 1 \-A 1 v_ 


PTTP P A A TT A 


TPGTCGGATC 


4 4 4 Vv\J 4 4 4 V* 


TTAATTGAAA 

4 4 4 4 \J 


AGGGCTTCCA 


2760 


PI^PTlTifA A 

fvjn I AtAA 


L»\jAvj 1 1 1 A 1 v- 


A AG ATP A AGG 


PTAPTT AGTG 


r*fiP A ATT^TTT 

\— 4 V« 4 4 4 


TTGTACTGGT 


2820 


T Hu i I I 1 A 1 


iTinr atpt 

Al Al_ iV-rti v. 1 


GTTGGPTPAG 


TGCrAAACTA 


ACACGGTCAG 


AAAAAGATAT 


2880 


TAAAGGCTQ.A 




A AGTPG A AGP 


PGAAPTGAAA 


GGPPTPATGT 


PPCTCAACTC 


2940 




PTTTW* & A A A 


A AT ATGTGPT 


AGGT ATTPTT 


GCTATTGCCA 


GTGGACTCAT 

VJ 4 VJ\/*»V* -a 4 


3000 




f* APPP A PPP A 


PPATTPAAPT 


TGGAGP AGTT 


^j/vj 4 vjvj 4 nnnu 


GAATTGCCAA 


3060 




TPPAPTPPAG 


TAGAGGAACG 


TTCCTTGATT 

4 4 X_ X* A 4 Wrl 4 4 


GCCAGTGGAG 


CTGCAGCAGG 


3120 


TTTAv»\.Lvjt-A 


vj<_1_ 111 AA 1 \J 


PTPPTATTGP 


AGPAPTTCTP 


TTTK TTfll* AG 

4 X 4 W A 4 \J 4 f*V 


AAGAAGTCTA 


3180 


t*p a r*p A TTT T 
1 LALUnl ill 


l 1_U\_Vj>V„ till 


TPTGGGT PTC 


AACTCTAGCA 


GCCAGCATCG 

XJ. V« Via m\ Vs mm m V»» 


TAGCAAACTT 


3240 


lu 1 vj I V_ ILIA 


k, i 1 ul i V. v> 


GTTTGAPACC 


AGTATTGGAT 

f^\J 4 n 4 4 \^*wn 4 


ATGCCAGATA 


ACATTCCTCC 


3300 




uA 1 Lau 1a1 1 


GGATATATPT 


CGTCATGGGA 


^> 4 4 4-4 W *■ Ai 


GATTTTCAGG 

\J* A A * * • V* 1 * *%#V#' 


3360 


1111 \J\ v. 1 Al 


PAGAAAGPTG 


TATTAAACGT 


TGGAAGAGTT 


TATGACTTGA 

4 ■ ■ 4 4 4 *W*r» 


TTCGTCAAAA 


3420 


AATCCATTTG GATAGGGCTT ATTATCCCAT CTTGGCTTTT ATCCTTATCA TACCAGTCGG 3480

AATCTTCTTA CCTCAAATCA TTGGTGGCGG AAATCAGCTT GTCCTTTCTT TAACTGAACA 3540

AAATTTTAGT TTCCAAGTTT TATTAGCTTA CTTTTTAATC CGCTTTATTT GGAGTATGAT 3600

TAGCTATGGA AGTGGACTGC CAGGAGGAAT TTTCCTCCCC A'niTAGCTC TTGGTTCTTT 3660

GCTTGGTGCC TTAGTTGGTG TTATCTGTGT CAATCTTCGA CTTGTCAGTC AAGAGCAATT 3720

CCCTATATTT GTCATTCTAG GAATGAGTGG CTATTTTGGA GCCATATCAA AAGCTCCCTT 3780

AACCGCTATG ATCCTCGTAA CTGAGATGGT AGGAGATATT CGCAACCTTA TGCCACTTGG 3840

TCTTGTCACT CTTGTTTCTT ATATTATCAT GGATTTGCTC AAAGGTACGC CAGTCTATGA 3900

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AGCCATGCTG GAAAAAATGC TTCCAGAAGA AGTATCTAGC GAAGGAGAAG TTACACTTAT 3960 

CGAAATACCA CTTTCTGATA AAATTGCTGG GAAACAAGTT CATGAACTCA ACTTACCACA 4020 

CAACGTCCTC ATCACAACTC AAGTCCATAA TGGCAAGAGC CAAACAGTTA ACGGCTCAAC 4080 

CAGAATGTAT CTGGGTGATA TGATTCACCT GGTTATTCCA AAAAGTGAAA TTGGAAAAGT 4140

CAAAGATTTG TTGTTGTAGT ATGAGTATTT ACATAATTTA TGTTATGTAA ATGATCAGTT 4200

TGATTTATTT AGAAAACCGA TTCTCAGGAA TGAGATCGGT TATTTTTTAC TGATGAGGAA 4260 

TTTTACATAT AAATAATTGA ACTTTATTAA AAATAAGACT ATAATTAAGT TAGAAATGAT 4320

AAAGTATAAA GCTAGAAAGG AGTTTACTGT ATCAAATCTG TACAGTAAGA TTAAAATCAT 4380

GAAAAAGAAA ACAATAGCAA TTATATAGAG AAATGAAATA GAAATAGGAT AAAACAATCA 4440 

GGACAATCAA ATCAATTTCT ACCAATGTTT TAGAAGTCCA GATGTACTAT TCTAGTTTCA 4500 

ATCTATTA7A CAATGTGTTT TGTATCTCAT AGCTCCTTAT ATAGCTCTTC AGTTATGTAG 4560 

TATTAACAGA AGTTTAGTGG GTGAGATTTT TATTATTTTC CTTATTCTGT TTTGTTTGTA 4620 

GGTCTAAGTC TTTTTATCAC TTTGAAAAAC TCCTATAACA TCTTTCCGAA AAACTATAAT 4680

TTTCTTGAAA AATATACAAG TCTATGCTAT ACTACTAGTA TACTTACTTA TGGAGAAAAT 4740 

ACATGAAACG TGAGATTTTA CTGGAACGAA TCGACAAACT AAAACAACTC ATGCCCTCGT 4800

AAGTTCTGGA ATACTACCAA 4820
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CTACGACATC ATGATTAACA GTCATGCGCT ACTACCAACT GAGCTATGGC GGATAAAATA 60 
GTCCGTACGG GATTCGAACC CGTGTTACCG CCGTGAAAAG GCGGTGTCTT AACCCCTTGA 120 
CCAACGGACC TTCTATCTGT AGCAGATATA ACCATTATAT CAATTTCTTG CTAATTGTCA 180 
ATCACTTTTG AGATTTTTTC TCTAAAATAT CTTTTAATTT TCTAATTTTT AATCTTGAAA 240 
TAGGACAACG ATGGTCTTCA TAGAAAACAA TTTCTAAGTT TTTTCGATCA ATTTCTCTGA 300 
TATTACCTAT ATTTACCAAA AATGACTTGT GAGGAGAATA AAATCGCTGA GTATGTTTGT 360 
CCTTTTCCTG AATATCTGTC ATGGTACCAT AAAACTCTTT TGCAAAATTC TTACCAATAA 420 
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TGCGCAATTT ATGAGATACC CCTGTTGTTT CAATATACAA AATATCATGC TAAGGAATTT 480 

TTAAATCATT TCCCTTGTAA TTGTAGTCGA AATAATCTAC AACATCTTCA TTTTCAAGTA 540 

ACATACTCTT CGTGTAGAAG ATATTTTGCT CAATTCTCTT CTTAAACATC TCATCATTGA 600 

TATCCTTATC AACAAAATCT AGGGCTGATA CCTGGTATTT ATAGGTTAGA GTCGCAAACT 660 

CTGATCGACT AGTGATAAAG ACGATAATAG CGTAAGGATT GTAATGACGA ATGAGCTGAG 720 

CCACTTCAAA TCCCTTTTTC TCAATTCCAT GAATATCGAT ATCTAGGAAA TAAAGCTGAT 780 

TTACTTCATC ATTTTCAATG TATTCTTCAA ATTCACGGAC TTTTCCCGTT GTCTTGTATG 840

ATATTGGAAT ATTCGATTCT TTCGAAATTT CATCCAATAT TCTCTCTAGT CTCACTTGAT 900 

GTTCAATAAC ATCTTCTAAA ATTAAAACTT TCATTCAAAT TCCCTCTTAA ATCTAATGAT 960 

TTGTCTAAAT GTACTGCCTT CCATCTCTGT TTCTAAAATA ATATTGTTGT ACTTATCTAG 1020 

TACTTCTTTC ACATTATTTA ATCCGACTCC GCGATTTCTT CCCTTAGTGG AGAATCCTAA 1080 

GGCAAATAGA TCTCCTGAAG GAGTCATCGT CATTTTACAT GAATTCTGAA TCACAATAAC 1140 

TGTTTCAGTT TCCATCTTAA TAACTGCTAC TTCCATCTGC TTTTTATAGC TATCAGCCGA 1200 

TCCTTCGACA GCATTATTCA ATAAAACGCT CATGATACGA ACCAAATCCA ATAGTTCAAT 1260 

TGGAAGCTTG GTAATCGTAT CTTTTACTTC CAGTGTAAAC TCTACACCAT TATTTCGAGC 1320 

ATAGACAATT GACTGAGCAA CCAAACTTCG TAAAGCTGAG TCTTCTATGT TGTTCAAATC 1380

AAAGTAAGTG TACTTATCTG AACGCAATTT ATGATTTGCT TTGACTAAAA CTTCATTGTA 1440 

AATTCTGTCA ATTTCCTGTA AATTACCACT GTCAATTGCC ATCTGCATGC TGACAAGCAT 1500 

TCCAGCATAA TCATGTCGAA AACCACGGAT TTCATTATAC AGACCAACAA TTTCATCTGT 1560 

GTAATTCTGT AAATGTTTCT GTTCAAATTT CTTCTGCTTC AAAGCAATCT CTTTCTCCAT 1620 

TTGAACTTTA TGAGAATTCA TTGCAAAGAA GGTCAAAAGG AGAGAGATAA AGACAATAGA 1680 

TGACAAAATA CTTCCAAAAC TATTCAAATG TTTAATCGTA CTTACCATAT CTGAAACGAA 1740 

AGATACAATA TGTAGCAATA GTAAAGCAAA AAATACTTTT TTCAAGAAAG GATAAAGGTA 1800 

GTCCTTGTCA AAATAGGCTA GTTCCAAATG GAAATAGTAA ATGATTTTTA ATGTAACAAA 1860 

ATAGGTTAAC ACCGTCACAA CGAAAAAGAA TGGGAAATGA TATTGTAAAA CAAAATTATC 1920 

TCCTGTTATA GAGGAGAAAA TTACGGACAG AAAGTTATGA GTGCTCTCAT ATAAAAGAGA 1980 

TAGTAGTAAA CTTAGGAATA GTCCTCTATC CCTCTCATAC TGTTTCATCC ATCGAAAATA 2040 

GGAATATAAG CCCAAAGGAA ATAAAAATCT TTCAATCCCT ATTTTATCTA AATATAGAAG 2100 

ATAAAAGGAA AATTCAAGTA CTATTTCAGT TAGTAATGTA TAAGCACCAA AAACGTATAA 2160 

TTCTTTTCTA TTTATTCGAC CTTTACAAAT TAAACGGTAA CTGTGACTAA TAATTAAAAA 2220 
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ATGAACAATA ACTGTCCCAA ATCCAAGTAA ATCCATTACT CTTTCTCCTT ATTTCATTAC 2280 

TTTTTTCGTA GGAAAAGAAA ATCAAGGATG ATTCTTGAAA TCCTCATCTC CCCACCTTTA 2340 

ATCTTTTGTA AGTCTTTTTC CTTCAAAGCT ACAAACTGTT CCAATTTAAC TGTGTTTTTC 2400 

ATAATAAAAT CTCCTAAAAT GTTTTTTCTT GTAAGCTAAC TTACAAAAAC CATTATACAA 2460 

AATGGAATTT CGTTTTAGAT AAAATTCTCT CAACTGTCAT TTTTTTCTCC CAAAGTGTAC 2520 

TTTTTTAAGA AAAAAGCCGG GAAAATTCCC AGCTTTGCTA TTATATTGAT CCCAGCAGGA 2580 

TTCGAACCTG CGACCGTTCG CTTAGAAGGC GAATGCTCTA TCCAGCTGAG CTATGAGACC 2640 

TAATACAATT ATTCTACCAA AAATTCAATT AAAAGTCAAT TTTCTATTTA TGGTAGGGGA 2700 

ATCCCTGCTG AATCGTAAAA GCGCGATAGA TTTGTTCAAC AAGAACTAGT CTCATTAACT 2760 

GATGGGGTAA GGTTAGGCGA CCAAAACTGA CAGAAAGATT GGCTCTATTT TTTACAGATG 2820 

ATGATAATCC TAAACTTCCC CCAATAATAA AAGTAAGAGT AGAAAATCCT TTTATAGAAG 2880 

TTTCTTCTAA CTGCTTACTA AATTCTTCTG AGAAGAAAGT TTTCCCTTCA ATGGCTAACA 2940 

CAATAACGAA ATCACCCTCA GCAATTTTTG ATAAAATTCT CTGACCTTCT ATTTCTAAAA 3000 

TCTTTTGATT TTCTGATTCA CTGGCCTTAT CTGGTGTTTT TTCATCTGAT AACTCAATCA 3060 

TTTCAAACTT AGCAAATCTA GAAATTCGTT TTGAATACTC TGCGATACCA TCTTTTAAAT 3120 

ACTTTTCTTT CAGTTTCCCA ACTGTTACAA CTTTAATTTT CATGACTCTA TTCTAACATA 3180 

TTCTCTATTT TTTCACATCT TATTCACAAA ATAAAAAATA GATTTCAATT AAGAAAATCA 3240 

CAATTTCAAA AGAGTTATCC ACAGTTTGTG TAAAACTTTT GTGTTTAAGT TATAATTAAG 3300 

CTACTCAGTT TATACTTTCA GTAATTCAAA CATATGGAGG CAAATATGAA ACATCTAAAA 3360

ACATTTTACA AAAAATGGTT TCAATTATTA GTCGTTATCG TCATTAGCTT TTTTAGTGGA 3420 

GCCTTGGGTA GTTTTTCAAT AACTCAACTA ACTCAAAAAA CTAGTCTAAA CAACTCTAAC 3480 

AACAATACTA CTATTACACA AACTGCCTAT AAGAACGAAA ATTCAACAAC ACAGCCTGTT 3540

AACAAAGTAA AAGATGCTGT TGTTTCTGTT ATTACTTATT CGGCAAACAG ACAAAATAGC 3600 

GTATTTGGCA ATGATGATAC TGACACAGAT TCTCAGCGAA TCTCTAGTGA AGGATCTGGA 3660 

GTTATTTATA AAAAGAATGA TAAAGAAGCT TACATCGTCA CCAACAATCA CGTTATTAAT 3720 

GGCGCCAgCA AAGTAGATAT TCGATTGTCA GATGGGACTA AAGTACCTGG AGAAATTGTC 3780 

GGAGCTGACA CTTTCTCTGA TATTGCTGTC GTCAAAATCT CTTCAGAAAA AGTGACAACA 3840 

GTAGCTGAGT TTGGTGATTC TAGTAAGTTA ACTGTAGGAG AAACTGCTAT TGCCATCGGT 3900 

AGCCCGTTAG GTTCTGAATA TGCAAATACT GTCACTCAAG GTATCGTATC CAGTCTCAAT 3960 
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AGAAATGTAT CCTTAAAATC GGAAGATGGA CAAGCTATTT CTACAAAAGC CATCCAAACT 4 020 

GATACTGCTA TTAACCCAGG TAACTCTGCC GGCCCACTGA TCAATATTCA AGGGCAGGTT 4080 

ATCGGAATTA CCTCAAGTAA AATTGCTACA AATGGAGGAA CATCTGTAGA AGGTCTTGGT 4140 

TTCGCAATTC CTCCAAATGA TGCTATCAAT ATTATTGAAC AGTTAGAAAA AAACGGAAAA 4200

GTGACGCGTC CAGCTTTGGG AATCCAGATG GTTAATTTAT CTAATGTGAG TACAAGCGAC 4260

ATCAGAAGAC TCAATATTCC AAGTAATGTT ACATCTGGTG TAATTGTTCG TTCGGTACAA 4320

AGTAATATCC CTGCCAATGG TCACCTTGAA AAATACGATG TAATTACAAA AGTAGATGAC 4380 

AAAGAGATTG CTTCATCAAC AGACTTACAA AGTGCTCTTT ACAACCATTC TATCGGAGAC 4440 

ACCATTAAGA TAACCTACTA TCGTAACGGG AAAGAAGAAA CTACCTCTAT CAAACTTAAC 4500 

AAGAGTTCAG GTGATTTAGA ATCTTAATTG ACATCTATCT AAAGAAAGCT TTACATAAGA 4560 

GAAAAGATGT GTTACTCTAG AATCATGGAA AAATTTGAAA TGATTTCTAT CACAGATATA 4620 

CAAAAAAATC CCTATCAACC CCGAAAAGAA TTTGATAGAG AAAAACTAGA TGAACTAGCA 4680

CAGTCTATCA AAGAAAATGG GGTCATTCAA CCGATTATTG TTCCTCAATC TCCTGTTATT 4740 

GGTTATGAAA TCcTTGCAGG AGAGAGACGC TATCGGGCTT CACTTTTACC TGGTCTACGG 4800 

TCTATCCCAG CTGTTGTTAA ACAGATTTCA GACCAAGAGA TGATGGTCCA GTCCATTATT 4860

GAAAATTTAC AGAGAGAAAA TTTAAACCCA ATAGAAGAAG CACGCGCCTA TGAATCTCTC 4920

GTAGAGAAAG GATTCACCCA TGCTGAAATT GCAGATAAGA TGGGCAAGTC TCGTCCATAT 4980 

ATCAGCAACT CCATTCGTTT ACTTTCCTTG CCAGAACAGA TTCTTTCAGA AGTAGAAAAT 5040 

GGCAAACTAT CACAAGCCCA TGCGCGTTCC CTAGTTGGGT TAAATAACGA ACAACAAGAC 5100 

TATTTCTTTC AACGGATTAT AGAAGAAGAT ATTTCTGTAA GGAAATTAGA AGCTCTTCTG 5160 

ACAGAGAAAA AACAAAAGAA ACAGCAAAAA ACTAATCATT TCATACAAAA TGAAGAAAAA 5220 

CAGTTAAGAA AACTACTCGG ATTAGATGTA GAAATTAAAC TATCTAAAAA AGACAGTGGA 5280 

AAAATCATTA TTTCTTTTTC AAATCAAGAA GAATATAGTA GAATTATCAA CAGCCTGAAA 5340 

TAAGGCTGTT CTTTTATTTT TTTATCTCAC AAGGTTATCC ACTATGTTTT TCGATAAAAA 5400 

GCTTAATAAA TCAATAATTT CTTCTTTTAT CCCCAACCTG TGGATAAAGT TTGGTAACAT 5460

TGTGGATTAT TTTTCACAGC TTGTGGAAAA TTCTTGCTAT CTATGGTAAA ATATCTCTAG 5520 

TATTAAACTT TTAAATAGTA AAGGAGGAGA AAGGATTGAA AGAAAAACAA TTTTGGAATC 5580 

GTATATTAGA ATTTGCACAA GAAAGACTGA CTCGATCCAT GTATGATTTC TATGCTATTC 5640 

AAGCTGAACT CATCAAGGTA GAGGAAAATG TTGCCACTAT ATTTCTACCT CGCTCTGAAA 5700 

TGGAAATGGT CTGGGAAAAA CAACTAAAAG ATATTATTGT AGTAGCTGGT TTTGAAATTT 5760 
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ATGACGCTGA AATAACTCCC CACTATATTT TCACCAAACC TCAAGATACG ACTAGCTCAC 5820 

AAGTTGAAGA AGCTACAAAT TTAACTCTTT ATAACTATAG TCCAAAGTTA GTATCTATTC 5880 

CTTATTCAGA TACGGGATTA AAAGAAAAGT ATACCTTTGA TAACTTTATT CAAGGGGATG 5940 

GAAATGTTTG GGCTGTATCA GCCGCTTTAG CTGTCTCTGA AGATTTGGCT CTGACCTATA 6000 

ACCCTCTTTT TATCTATGGA GGACCAGGCC TTGGTAAGAC TCACTTATTA AACGCTATTG 6060 

GAAATGAAAT TCTAAAAAAT ATTCCTAATG CGCGTGTTAA ATATATCCCT GCCGAAAGCT 6120 

TTATTAATGA CTTTCTTGAT CACCTAAGAC TTGGGGAAAT GGAAAAGTTT AAAAAGACCT 6180 

ATCGTAGTCT TGATCTTTTG TTAATCGATG ATATCCAGTC ACTCAGCGGA AAAAAAGTCG 6240 

CAACTCAGGA AGAATTTTTC AATACCTTTA ACGCCCTTCA TGACAAGCAA AAACAGATTG 6300 

TCCTAACGAG TGATCGTAGT CCAAAACATC TAGAAGGGCT CGAGGAGAGG CTTCTCACGC 6360 

GTTTTAGTTG GGGATTGACA CAAACTATCA CCCCCCCTGA CTTTGAAACA CGTATTGCCA 6420 

TTTTACAAAG TAAGACGGAA CATTTAGGCT ACAATTTCCA AAGTGATACT CTAGAATACC 6480 

TAGCTGGGCA ATTTGATTCA AATGTTCGAG ATCTTGAGGG AGCCATCAAC GACATCACTT 6540 

TAATTGCCAG AGTAAAAAAA ATCAAGGATA TCACTATTGA TATTGCTGCA GAAGCCATTA 6600 

GAGCCCGCAA ACAAGATGTT AGCCAAATGC TCGTCATCCC AATTGATAAA ATCCAAACTG 6660 

AAGTTGCTAA CTTTTATGGT GTTAGTATCA AAGAAATGAA GGGAAGTAGA CGCCTTCAAA 6720 

ATATTGTTTT GGCCCGTCAA GTAGCCATGT ATTTATCTAG AGAACTAACA GATAATAGTC 6780 

TTCCAAAAAT TGGGAAGGAA TTTGCGGGAA AACATCATAC CACAGTCATT CATCCCCATG 6840 

CCAAAATAAA ATCTTTCATT GATCAAGACG ATAATTTACC TTTAGAAATT GAATCAATCA 6900 

AAAAGAAAAT CAAATAATTT GTGGATAACT TTTAGTTTTT TATCTTTTTT ATCCACATTT 6960 

TTTAAACAAG CTAAAAAACT TGATATGACT TGTTTAAAGG CTGTTTTCCA CAGATTTCAC 7020 

AGACTCTATT ATTACTATTA TCTTTCTAAT ACTAAAAATA AATAAAGGAG AATCCATGAT 7080 

TCATTTTTCA ATTAATAAAA ATTTATTTCT ACAAGCATTA AATACTACTA AGAGAGCTAT 7140 

TAGTTCTAAA AATGCCATTC CTATTTTATC AACACTAAAA ATTGACGTGA CCAATGAAGG 7200

TATTACTTTA ATTGGTTCAA ATGGTCAAAT TTCAATTGAA AATTTTATTT CTCAAAAAAA 7260

TGAAGATGCT GGTTTGTTAA TTACTTCTTT AGGTTCGATC CTTCTTGAAG Cri V I'ETC'l T 7320 

TATCAATGTA GTATCTAGTT TACCTGATGT AACTCTTGAT TTTAAAGAAA TTGAACAAAA 7380 

TCAAATTGTT TTAACCAGTG GCAAATCAGA AATTACCCTA AAAGGAAAAG ATAGCGAACA 7440 

ATATCCACGA ATCCAAGAAA TTTCAGCAAG CACTCCTTTA ATACTTGAAA CAAAATTACT 7500 
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CAAGAAAATT ATTAATGAAA CAGCCTTTGC TGCAAGTACA CAAGAGAGTC GTCCGATTTT 7560 

AACAGGTGTC CACTTCGTAT TGAGTCAACA CAAAGAGTTA AAAACAGTTG CAACAGACTC 7620

TCATCGCCTA AGCCAGAAAA AATTGACTCT TGAAAAAAAT AGTGATGATT TTGATGTCGT 7680

AATTCCTAGC CGTTCTCTAC GCGAATTTTC AGCGGTATTT ACAGATGATA TCGAAACTGT 7740 

AGAGATTTTC TTTGCCAATA ACCAAATCCT CTTTAGAAGC GAAAATATTA GCTTCTATAC 7800 

TCGTCTCCTA GAAGGAAACT ATCCTGATAC AGATCGCTTG ATTCCAACAG ACTTTAACAC 7860 

TACTATTACT TTTAATGTGG TAAACTTACG CCAGTCAATG GAGCGTGCCC GTCTTTTATC 7920

AAGTGCGACT CAAAATGGTA CTGTGAAACT TGAAATTAAG GATGGGGTTG TTAGCGCCCA 7980 

TGTTCACTCT CCAGAAGTTG GTAAAGTAAA CGAAGAAATC GATACTGATC AGGTTACTGG 8040 

TGAAGATTTG ACCATTAGTT TCAACCCAAC TTACTTGATT GATTCTCTTA AAGCTTTAAA 8100 

TAGCGAAAAG GTGACTATTA GCTTTATCTC AGCTGTTCGT CCATTTACTC TTGTGCCAGC 8160 

AGATACTGAC GAAGACTTCA TGCAGCTCAT TACACCAGTT CGTACAAATT AAGTGAAAGA 8220 

CGTTGAGCCT GGCTCGCCTC TTTTATGATA TAATCGAAAA AGAAAAGGAG AGTAGTATGT 8280 

ATCAAGTTGG AAATTTTGTT GAGATGAAAA AATCACACGC TTGTACAATC AAGTCGACTG 8340 

GTAAAAAGGC TAATCGTTGG GAAATTACAC GTGTAGGAGC AGATATCAAA ATAAAATGTA 8400

GTAATTGTGA GCATGTTGTC ATGATGGGGC GATATGATTT TGAGCGAAAA ATGAATAAAA 8460

TTATTGACTG AGAACCCTTA GTTAGAGGGT TAGCACTTTA TCCCTTTTTG TGTTATAATA 8520 

TTAGGGATTG AAATGAAAAC GGAGAATGAG AAATATGGCT TTGACAGCAG GTATCGTTGG 8580 

TTTGCCAAAC GTTGGTAAAT CAACACTATT TAATGCAATT ACAAAACCAG GAGCAGAGGC 8640

AGCAAACTAC CCATTTGCGA CGATTGATCC AAATGTTGGA ATGGTGGAAG TTCCAGATGA 8700 

ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG ACAGTTCCCA CAACATTTGA 8760

ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA GCAGAGGGCC TAGGGAATAA 8820 

ATTCTTCGCC AATATTCGTG AAGTAGATGC GATTGTTCAC GTAGTTCGTG CTTTTGATGA 8880 

TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT GTAGATCCAC TTGCAGATAT 8940 

TGATACCATT AATCTGGAAT TGATTCTTGC TGACTTAGAA TCAGTGAACA AACGATATGC 9000 

GCGTGTAGAA AAGATCGCAC GTACGCAAAA AGATAAAGAA TCAGTAGCAG AATTCAATGT 9060 

TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA GCTCGTACCA TTGAATTTAC 9120 

AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG ACGACTAAAC CAGTTCTTTA 9180 

TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC TCTATCGACT ATGTCAAACA 9240

AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC GTTATTTCTG CGCGTGCTGA 9300 
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GGAAGAAATT TCTGAATTGA ATGATGAAGA TAAAAAAGAG TTTCTTGAAG CCATTGGTTT 9 360 

GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC CACTTGCTTG GATTGGGAAC 9420 

TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT TTCAAACGTG GTATGAAGGC 9480 

TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA GGCTTTATTC GTGCAGTAAC 9540 

CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG GCCGTAAAAG AAGCTGGACG 9600 

CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC GATATCATCG AATTCCGCTT 9660 

TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA AAAAAATTCC AACCCTTTTG 9720 

GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG CTTGGGAAAT CCAGGGGATA 9780 

AATATTTTGA AACAAAACAC AATGTTGGTT TTATGTTGAT TGATCAACTA GCGAAGAAAC 9840 

AGAATGTCAC TTTTACACAC GATAAGATAT TTCAAGCTGA CCTAGCATCC TTTTTCCTAA 9900 

ATGGAGAAAA AATTTATCTG GTTAAACCAA CGACCTTTAT GAATGAAAGT GGAAAAGCAG 9960

TTCATGCTTT ATTAACTTAC TATGGTTTGG ATATTGACGA TTTACTTATC ATTTACGATG 10020

ATCTTGACAT GGAAGTTGGG AAAATTCGTT TAAGAGCAAA AGGCTCAGCA GGTGGTCATA 10080 

ATGGTATCAA GTCTATTATT CAACATATAG GAACTCAGGT CTTTAACCGT GTTAAGATTG 10140 

GAATTGGAAG ACCTAAAAAT GGTATGTCAG TTGTTCATCA TGTTTTGAGT AAGTTTGACA 10200 

GGGATGATTA TATCGGTATT TTACAGTCTG TTGACAAAGT TGACGATTCT GTAAACTACT 10260 

ATTTACAAGA GAAAAATTTT GAGAAAACAA TGCAGAGGTA TAACGGATAA ATGGTGACCT 10320 

TATTAGATTT ATTCTCAGAA AATGATCAGA TTAAAAAATG GCATCAAAAT TTAACAGATA 10380

AGAAAAGACA ACTAATACTT GGTTTATCAA CATCTACTAA GGCTCTTGCA ATTGCAAGCA 10440 

GTTTAGAAAA AGAAGATAGG ATTGTGTTAT TGACGTCAAC TTATGGAGAA GCAGAAGGAC 10500 

TTGTTAGTGA TCTTATTTCT ATCTTGGGTG AGGAACTCGT CTATCCATTT TTGGTAGATG 10560

ATGCTCCTAT GGTGGAGTTT TTGATGTCTT CACAGGAAAA AATTATTTCA CGGGTTGAAG 10620 

CCTTGCGTTT TTTGACTGAT TCATCTAAGA AAGGGATTTT AGTTTGTAAT ATCGCAGCAA 10680 

GTCGATTGAT TTTACCGTCT CCCAATGCAT TCAAAGATAG TATTGTAAAA ATCTCAGTTG 10740 

GTGAAGAATA TGATCAACAC GCGTTTATCC ATCAGTTAAA GGAAAATGGC TATCGAAAAG 10800 

TTACTCAAGT ACAAACTCAG GGCGAATTTA GTCTTCGAGG AGATATTTTA GATATTTTTG 10860 

AAATATCCCA GTTAGAACCT TGTCGAATTG AGTTTTTTGG TGATGAAATT GATGGTATCA 10920 

GGTCATTTGA AGTAGAAACA CAATTATCGA AAGAAAATAA GACAGAACTC ACTATCTTTC 10980 

CAGCTAGTGA TATGCTTTTG AGAGAAAAGG ATTATCAACG AGGACAGTCA GCTTTAGAAA 11040 






AACAAATTTC AAAAACTTTA TCACCTATTT TGAAATCATA CCTAGAAGAA ATTCTTTCAA 11100

CTTTTCACCA AAAACAAAGT CATGCAGACT CTCGGAAGTT TTTATCTTTG TCCTATGATA 11160


AGACATGGAC 


TftTf*TTTfiAT 

4 VJ 4^444 vin • 


TATATTGAAA 


AAGATACTCC 


AATATTCTTT 


GATGATTATC 


11220 


AAAAATTdAT 


GAATCAGTAT 


GAAGTCTTTG 


AAAGAGACTT 


AGCGCAGTAC 


TTTACAGAAG 


11280 


AATTAPAfiAA 


TAfTT AAAflPA 


TTTTPTG ATA 

4 4 4 i \>l\JAir* 


TGCAGTATTT 


TTCTGATATT 


GAACAAATCT 


11340 


ATAAAAAATA 


AAfiTPCAGTG 


A f^CTT TTT CT 

• •w 4 4 A A • * 


CTAATCTTCA 


AAAGGGTTTA 


GGAAATCTCA 


11400 


AATTTflACAA 


AATTT AT C AA 


TTCAATCAAT 


ATCCTATGCA 


GGAATTTTTC 


AATCAGTTTT 


11460 


1.1 1 1 1U1 nnn 


AHA Afi AA ATT 


GAACGATATA 


AAAAAATGGA 


TTACACCATT 


ATTCTGCAGT 


11520 






AAAACATTGG 


AGGATATGTT 


AGAGGAATAT 


CAGATTAAAT 


11580 




nun 1 f\/\\*jr\s~r\ 


AATATPTGTA 


AAGAATCTGT 


AAACTTAATA 


GAGGGTAATC 


11640 


TCAGACATOG 


111 ILA1 ill 


f*T Afl A1Y3 AAA 


ifZ ATT TT ATT 

f\\jr> 4 1 4 a n 4 a 


GATAACTGAA 


CATGAGATTT 


11700 


ITLAAAAbAA 


AI !AAAvAA»l 


VrV lilt %»w**vf\ 


GACAACATGT 


TTCAAATGCA 


GAGAGATTAA 


11760 


aaua rr AU\n 




AAAflGGGACT 


ATGTTGTCCA 


TCATATCCAT 


GGGATTGGTC 


11820 


AATATL1 ACjO 


& Ai"iv A a a*"^ 


ATTflA AATPA 


Af^^flAATTCA 


TCGCGATTAT 


GTCAGTGTCC 


11880 


AATACCAAAA 


*Tf*r*TC A r Pf' > & & 




prnTtinAACA 


GATTCATCTA 


CTGTCCAAAT 


11940 


ATA.TTTV.AAtj 


ItiAluOlAAA 


fZC^CC* AAAAP 


TCAATAAATT 

4 ^«w% 4 4 4 


AAATGACGGT 


CATTTTAAAA 


12000 


AvjVjCv. AAbL A 


AAAIA* 1 nnw 


AAf*rAGGTAG 


AGGATATAGC 


TGATGATTTA 


ATCAAACTCT 


12060 


ACTCTviAAv-vj 




A AflfyiTTTTf! 


PTTTPTC* AGC 


TGATGATGAT 


GATCAAGATG 


12120 




111 Wl-t- I 


TATGTTGAAA 


CGGATGATCA 


ACTTCGTAGT 


ATTGAGGAAA 


12180 




1 n 1 V*L./\UV)V> 1 


TPTPACrrAA 

4^4 CAU^nn 


TGGATCGACT 


TTTAGTTGGG 

AAA- • • 


GATGTTGGTT 


12240 




lunnu 1 * 1 


ATHPtVPGCAG 

X UvVj 4 vK»nw 


CCTTTAAAGC 


AGTCAATCAT 


CACAAACAGG 


12300 


TTf^*P^ & H V' 




ACGGTTTTAG 


CGCAACAGCA 


CTATACGAAT 


TTTAAGGAAC 


12360 


GATTCCAAAA TTTTGCAGTT AATATTGATG TGTTGAGTCG CTTTAGAAGT AAAAAAGAGC 12420

AGACTCCAAC ACTTGAAAAA TTGAAAAACG GTCAAGTCGA TATTTTGATT GGAACACATC 12480

GTGTTTTGTC AAAAGA'itrrr GTGTTTGCTG ATTTGGGCTT GATGATTATT GATGAGGAAC 12540

AGCGATTTGG TGTCAAGCAT AAGGAAACTT TGAAAGAACT GAAGAAACAA GTGGATGTCC 12600

TAACCTTGAC CGCTACGCCA ATCCCTCGTA CCCTCCATAT GTCTATGCTG GGAATCAGAG 12660

ATTTATCTGT TATTGAAACT CCGCCGACTA ATCGCTATCC TGTTCAGACC TATGTTTTGG 12720

AAAAGAATGA TAGTGTCATT CGTGATGCTG TCTTGCGTGA AATGGAGCGT GGAGGTCAAG 12780

TTTATTATCT TTACAACAAA GTTGACACAA TTGTTCAGAA GGTTTCAGAA TTACAGGACT 12840



TGATTCCGGA GGCTTCGATT GGATATGTTC ATGGTCGAAT GAGTCAAGTC CAGTTGGAAA 12900

ATACTCTATT AGACTTTATT GAGGGACAAT ACGATATCTT GGTGACGACT ACTATTATTG 12960 

AGACAGGGGT GGACATTCCA AATGCTAATA CTTTATTTAT TGAAAATGCG GACCATATGC 13020 

GCTTGTCAAC CTTATATCAG TTAAGAGGAA GAGTCGGTCG TAGTAATCGT ATTGCTTATG 13080 

CTTATCTCAT GTATCGTCCA GAAAAATCAA TCAGTGAAGT CTCTGAAAAG AGATTAGAAG 13140 

CGATTAAAGG ATTTACAGAA TTGGGCTCTG GCTTTAAGAT TGCAATGCGA GATCTTTCGA 13200 

TTCGTGGAGC AGGAAATCTT TTAGGAAAAT CCCAGTCTGG TTTCATTGAT TCTGTTGGTT 13260 

TTGAATTGTA TTCGCAGTTA TTACAGGAAG CTATTGCTAA ACGAAACGGT AATGCTAACG 13320

CTAACACAAG AACCAAAGGG AATGCTGAGT TGATTTTGCA AATTGATGCC TATCTTCCTG 13380 

ATACTTATAT TTCTCATCAA CGACATAAGA TTGAAATTTA CAAGAAAATT CGTCAAATTG 13440 

ACAACCGTGT CAATTATGAA GAGTTACAAG AGCAGTTGAT AGACCGTTTT GGAGAATACC 13500 

CAGATGTAGT AGCCTATCTG TTAGAGATTG CTTTCGTCAA ATCATACTTG GACAAGGTCT 13560

TTGTTCAACG TGTGGAAAGA AAAGATAATA AAATTACAAT TCAATTTGAA AAAGTCACTC 13620 

AACGACTGTT TTTAGCTCAA GATTATTTTA AAGCTTTATC CGTAACGAAC TTAAAAGCAG 13680 

GCATCGCTGA GAATAAGGGA TTAATGGAGC TTGTATTTGA TGTCCAAAAT AAGAAAGATT 13740 

ATGAAATTTT AGAAGCTTTG CTGATTTTTG GAGAAAGTTT ATTAGAGATA AAAGAGTCTA 13800 

AGGAAGAAAA TTCCATTTGA TATTTTTCTT CTATAAAATA GATAAAAATG GTACAATAAT 13860 

AAATTGAGGT AATAAGGATG AGATTAGATA AATATTTAAA AGTATCGCGA ATTATCAACC 13920 

GTCGTACAGT CGCAAAGGAA GTACCACATA AAGGTAGAAT CAAGGTTAAT GGAATCTTGG 13980

CCAAAACTTC AACGGACTTG AAAGTTAATG ACCAAGTTGA AATTCGCTTT GGCAATAAGT 14040

TGCTGCTTGT AAAAGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 14100 

TGTATGAAAT TATCAGTGAA ACACGCGTAG AAGAAAATGT CTAAAAATAT TGTACAATTG 14160 

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 14220

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 14280 

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 14340 

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACC 14400 

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 14460 

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 14520 

ATTAGACGTA ATAGAGCAAT TTTTGACTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 14580 
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TAAAAATCAA TTATTGCGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 14640 

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 14700 

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 14760 

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 14820

TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 14880 

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 14940

AACGATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 15000 

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 15060 

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 15120 

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 15180 

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 15240 

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 15300 

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAT AAATGAGGGT CTTTATCAGT 15360 

TAGATACGAC TGTAAAATAC GTATCTGCAG TCAATCATTT TCCAGGTTCT TATAAACCAG 15420 

AGGGAAGTGG TAGTCTTCCT AAAAAAGAAG ATAATAAAGA ATATTCTTTA AAGGATTTAA 15480 

TTACGAAAGT ATCAAAAGAA TCTGATAATG TAGCTCATAA TCTATTGGGA TATTACATTT 15540 

CAAACCAATC TGATGCCACA TTCAAATCCA ACATGTCTGC CATTATCGCA GATGATTGGG 15600 

ATCCAAAAGA AAAATTGATT TCTTCTAAGA TGGCCGGGAA GTTTATGGAA GCTATTTATA 15660 

ATCAAAATGG ATTTGTGCTA GAGTCTTTCA CTAAAACAGA TTTTGATAGT CAGCGAATTG 15720 

CCAAAGGTGT TTCTGTTAAA GTAGCTCATA AAATTGGAGA TGCGGATGAA TTTAAGCATG 15780 

ATACGGGTGT TGTCTATGCA GATTCTCCAT TTATTCTTTC TATTTTCACT AAGAATTCTG 15840 

ATTATGATAC GATTTCTAAG ATAGCCAAGG ATGTTTATGA GGTTCTAAAA TGAGGGAACC 15900 

AGATTTTTTA AATCATTTTC TCAAGAAGGG ATATTTCAAA AAGCATGCTA AGGCGGTTCT 15960 

AGCTCTTTCT GGTGGATTAG ATTCCATGTT TCTATTTAAG GTATTGTCTA CTTATCAAAA 16020 

AGAGTTAGAG ATTGAATTGA TTCTAGCTCA TGTGAATCAT AAGCAGAGAA TTGAATCAGA 16080

TTGGGAAGAA AAGGAATTAA GGAAGTTGGC TGCTGAAGCA GAGCTTCCTA TTTATATCAG 16140 

CAATTTTTCA GGAGAATTTT CAGAAGCGCG TGCACGAAAT TTTCGTTATC ATTTTTTTCA 16200 

AGAGGTCATG AAAAAGACAG GTGCGACAGC TTTAGTCACT GCCCACCATG CTGATGATCA 16260

GGTGGAAACG ATTTTTATGC GCTTGATTCG AGGAACTCGC TTGCGCTATC TATCAGGAAT 16320

TAAGGAGAAG CAAGTAGTCG GAGAGATAGA AATCATTCGT CCCTTCTTGC ATTTTCAGAA 16380
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AAAAGACTTT CCATCAATTT TTCACTTTGA AGATACATCA AATCACGAGA ATCATTATTT 16440 

TCGAAATCGT ATTCGAAATT CTTACTTACC AGAATTGGAA AAAGAAAATC CTCGATTTAG 16500 

GGATGCAATC TTAGGCATTG GCAATGAAAT TTTAGATTAT GATTTGGCAA TAGCTGAATT 16560 

ATCTAACAAT ATTAATGTGG AAGATTTACA GCAGTTATTT TCTTACTCTG AGTCTACACA 16620 

AAGAGTTTTA CTTCAAACTT ATCTGAATCG TTTTCCAGAT TTGAATCTTA CAAAAGCTCA 16680

GTTTGCTGAA GTTCAGCAGA TTTTAAAATC TAAAAGCCAG TATCGTCATC CGATTAAAAA 16740 

TGGCTATGAA TTGATAAAAG AGTACCAACA GTTTCAGATT TGTAAAATCA GTCCGCAGgC 16800 

TGATGAAAAG GAAGATGAAC TTGTGTTACA CTATCAAAAT CAGCTAGCTT ATCAAGGATA 16860 

TTTATTTTCT TTTGGACTTC CATTAGAAGG TGAATTAATT CAACAAATAC CTGTTTCACG 16920 

TGAAACATCC ATACACATTC GTCATCGAAA AACAGGAGAT GTTTTGATTA AAAATGGCCA 16980 

TAGAAAAAAA CTCAGACGTT TATTTATTGA TTTGAAAATC CCTATGGAAA AGAGAAACTC 17040 

TGCTCTTATT ATTGAGCAAT TTGGTGAAAT TGTCTCAATT TTGGGAATTG CGACCAATAA 17100 

TTTGAGTAAA AAAACGAAAA ATGATATAAT GAACACTGTA CTTTATATAG AAAAAATAGA 17160 

TAGGTAAAAA ATGTTAGAAA ACGATATTAA AAAAGTCCTC GTTTCACACG ATGAAATTAC 17220 

AGAAGCAGCT AAAAAACTAC GTGCTCAATT AACTAAAGAC TATGCACGAA AAAATCCAAT 17280 

CTTAGTTGGG ATTTTAAAAG GATCTATTCC TTTTATGGCT GAATTGGTCA AACATATTGA 17340 

TACACATATT GAAATGGACT TCATGATGCT TTCTAGCTAC CATGGTGGAA CAGCAAGTAC 17400 

TGGTGTTATC AATATTAAAC AAGATCTGAC TCAAGATATC AAAGGAAGAC ATGTTCTATT 17460 

TGTAGAAGAT ATCATTCATA CAGGTCAAAC TTTGAAGAAT TTGCGAGATA TGTTTAAAGA 17520 

AAGAGAAGCA GCTTCTGTTA AAATTGCAAC CTTCTTCGAT AAACCAGAAG GACGTGTTGT 17580 

AGAAATTGAG GCAGACTATA CTTGCTTTAC TATCCCAAAT GAGTTTGTAG TAGGTTATGG 17640 

TTTAGACTAC AAAGAAAATT ATCGTAATCT TCCTTATATT GGAGTATTGA AAGAGGAAGT 17700 

GTATTCAAAT TAGAAAGAAT AATCTTTAAT GAAAAAACAA AATAATGGTT TAATTAAAAA 17760 

TCCTTTTCTA TGGTTATTAT TTATCTTTTT CCTTGTGACA GGATTCCAGT ATTTCTATTC 17820 

rGGGAATAAC TCACGAGGAA GTCAGCAAAT CAACTATACT GACTTGGTAC AAGAAATTAC 17880 

CGATGGTAAT GTAAAAGAAT TAACTTACCA ACCAAATGGT AGTGTTATCG AAGTTTCTGG 17940 

TGTCTATAAA AATCCTAAAA CAAGTAAAGA AGAAACAGGT ATTCAGTTTT TCACGCCATC 18000 

TGTTACTAAG GTAGAGAAAT TTACCAGCAC TATTCTTCCT GCAGATACTA CCGTATCAGA 18060

ATTGCAAAAA CTTGCTACTG ACCATAAAGC AGAAGTAACT GTTAAGCATG AAAGTTCAAG 18120 
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TTAGCAGAAG 


ATGTTGATTT 


GAAATTAGTG 


GCTCAACAAA 


CTCCAGGCTT 


1 8900 


TGTTGGTGCT GATTTAGAGA ATGTCTTGAA TGAACCAGCT TTAGTTGCTG CTCGTCGCAA 18960

TAAATCGATA ATTGATGCTT CAGATATTGA TGAAGCAGAA GATAGAGTTA TTGCTGGACC 19020

TTCTAAGAAA GATAAGACAG TTTCACAAAA AGAACGAGAA TTGGTTGCTT ACCATGAGGC 19080

AGGACATACC ATTGTTGGTC TAGTCTTGTC GAATGCTCGC GTTGTCCATA AGGTTACAAT 19140

TGTACCACGC GGCCGTGCAG GCGGATACAT GATTGCACTT CCTAAACACG ATCAAATGCT 19200

TCTATCTAAA GAAGATATGA AAGAGCAATT GGCTGGCTTA ATGGGTGGAC GTGTAGCTGA 19260

AGAAATTATC TTTAATGTCC AAACCACAGG AGCTTCAAAC GACTTTGAAC AAGCGACACA 19320

AATGGCACGT GCAATGGTTA CAGAGTACGG TATGAGTGAA AAACTTGGCC CAGTACAATA 19380

TGAAGGAAAC CATGCTATGC TTGGTGCACA GAGTCCTCAA AAATCAATTT CAGAACAAAC 19440

ATTGATGAAG AGGTTCGTTC ATTATTAAAT GAGGCACGAA ATAAAGCTGC 19500

TGAAATTATT CAGTCAAATC GTGAAACTCA CAAGTTAATT GCAGAAGCAT TATTGAAATA 19560

CGAAACATTG GATAGTACAC AAATTAAAGC TCTTTACGAA ACAGGAAAGA TGCCTGAAGC 19620

AGTAGAAGAG GAATCTCATG CACTATCCTA TGATGAAGTA AAGTCAAAAA TGAATGACGA 19680

AAAATAACCC TGAGAGAGGC TGCAGCCTCT CTTTTTTGTG CAGTTTAGGA GCTAAAGGGA 19740

ACAGAATGGA GAAAATGGAA CAAATGTGTT TTCTAATCTG TTAGACTGTA TCTAGAAAGG 19800

GGAAAATTAT GATTAAAGAA TTGTATGAAG AAGTCCAAGG GACTGTGTAT AAGTGTAGAA 19860

ATGAATATTA CCTTCATTTA TGGGAATTGT CGGATTGGGA GCAAGAAGGC ATGCTCTGCT 19920

TACATGAATT GATTAGTAGA GAAGAAGGAC TGGTAGACGA TATTCCACGT TTAAGGAAAT 19980

ATTTCAAGAC CAAGTTTCGA AATCGAATTT TACACTATAT CCCTAAACAG GAAAGTCAGA 20040

AGCGTAGATA CGATAAAGAA CCCTATGAAG AAGTGGGTGA GATCAGTCAT CGTATAAGTG 20100

AGGGGGGTCT CTGGCTAGAT GATTATTATC TCTTTCATGA AACACTAAGA GATTATAGAA 20160

ACAAACAAAG TAAAGAGAAA CAAGAAGAAC TAGAACGCGT CTTAAGCAAT GAACGATTTC 20220

GAGGGCGTCA AAGAGTATTA AGAGACTTAC GCATTGTGTT TAAGGAGTTT ACTATCCGTA 20280

CCCACTAGTA AGTCATGCAA AAAAAATGAA AAAAATTAGA AAAAGTAGTT GACAAAGTTT 20340

GAAAAGGCTG TATAATAGTA AGAGTTGAAA ATAACAACTC AGGTCCGTTG GTCAAGGGGT 20400

TAAGACACCG CCTTTTCACG GCGGTAACAC GGGTTCGAAT CCCGTACGGA CTATGGTATG 20460

TTGCGTCAGG ACCACTTGAT GAAAAAAAGT TTAAAAAAAC TTAAAAATCT TCAAAAAAGT 20520

GTTGACAAGC GAAAGCAGTT GTGATATACT AATATAGTTG TCGCTTGAGA GAAGCAAGTG 20580

ACAAAGACCT TTGAAAACTG AACAAGACGA ACCAATGTGC AGGGCGCTAC AACGTAAGTT 20640

GTAGTACTGA ACAATGAAAA AAACAATAAA TCTGTCAGTG ACAGAAATGA GTAAGAACTC 20700

AAACTTTTTA ATGAGAGTTT GATCCTGGCT CAGGACGAAC CCTGGCGGCG TCCCTAATAC 20760

ATGCAAGTAG AACGCTGAAG GAGGAGCTTG CTTCTCTGGA TGAGTTGCGA ACGGGTGAGT 20820

AACGCGTAGG TAACCTGCCT GGTAGCGGGG CATAACTATT GGAAACGATA GCTAATACCG 20880

CATAAGAGTA GATGTTGCAT GACATTTGCT TAAAAGGTGC ACTTGCATCA CTACCACATG 20940

GACCTGCCTT GTATTAGCTA GTTGGTGGGG TAACGGCTCA CCAAGCCGAC GATACATAGC 21000

CGACCTGAGA GGGTGATCGG CCACACTGGG ACTGAGACAC GGCCCAGACT CCTACGGGAG 21060

GCAGCAGTAG GGAATCTTCG GCAATGGACG GAAGTCTGAC CGAGCAACGC CGCGTGAGTG 21120

AAGAAGGTTT TCGGATCGTA AAGCTCTGTT CTAAGAGAAG AACGAGTGTG AGAGTGGAAA 21180

GTTCACACTG TGACGGTATC TTACCAGAAA GGGACGCCTA ACTACGTGCC AGCAGCCGCG 21240

GTAATACGTA GGTCCCGAGC GTTGTCCGGA TTTATTGGGC GTAAAGCGAG CCCAGGCGGT 21300

TAGATAAGTC TGAAGTTAAA GGCTGTGGCT TAACCATA 21338


(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6273 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:






TGTTTTTAAA 


GAGCCGTGTC 


TGGATAGACT 


TTCGGACGCA 


ACGCTCTATT 




Ow 


CTGCCTATAC 


ACAAGATTTC 


TAACCTTAGT 


CGACATGAGC 


TGAAACCTCT 


TATTTY^TTAA 




GTAGTTCACA 


AAATATTATA 


CACCTATTTT 


ATGAATAGTC 


AACTGTCTTT 


APARTAAAAT 


i an 


TTTAGAAAAT 


CATGAAAATT 


TTCTCTTTCT 


TTCC ATTTT A 


AGTGACATTC 






ACATCAAAAA 


AGCCCAGACG 


AAATTGTCTG 


AGCATTCTTT 


TATCTAGTCG 


TTT A AfTTSA Ar* 


inn 


TTGAGTTCAG 


TATGTTTAAA 


GTCTCTGTCC 


CATCATTTCT 


TCAACAAACC 


1 1U1 [Li 1 v>V> 




AGAAACTCCT 


TGGCTACTTG 


CTTTGCTGAC 


TTGCCTTCAA 


CACCGACTTfi 


f^T A TTV^ a rr 
o 1 nu 1 IvjnuL 




TGGCTCATCT 


GGCTTTCTGT 


AATCTTACCA 


GCCAATGTAT 


TAACJAAfTPT 




1 OU 


GGGTGTTTCT 


TGAGAAGAGC 


TTCTTTCATG 


AGTGGAGCCC 








TGCTTGTCAT 


CTTCCAAGAC 


CTGTAAATCA 


TAACGCTCCA 


ftTTrrnriTr 


Av> 1 LbAA I Au 


500 


CCATCCGTGA 


TTTGAATATC 


CCCTGACTGA 


ATAGCCTGAT 




I VjvjL 1 LnA i \j 


£ f n 
ooO 


GTCGCTACAT 


TGAGATTGAG 


ACCATACATT 


GATTGCAAG C 


»— 4 A r\ 1 t 1 




inn 


TCGTTAAACT 


CGAGTGTAAA 


ACCTGCCTTC 


AACTGCCCTT 


\>w\U I 1 i I 1 1 


& AfT/TV IV It 
V— AALi I \- 1 Aj AA 


T O #x 

780 


ATGGTCTTCA 


AGCCATATTC 


TTGAGCAATC 


* -ft 1 4 1 




A 1 AvjVj iAj III 


84 0 


TGATAAGACA 


TGGGTTTGAG 


ATAGGCTAGA 


TGATCCTGCT 






AAA 

900 


ACCTGATAAA 


CCTGTTCTGG 


TTCATGACTC 


ACCTTGGGTG 




PA A A PTTTr & 


9oO 


GTCACCGTAC 


CAGTAAATTC 


AGGATAGATG 


TCAATATCGC 








AGGAAGCTTG 


TCTTCCCAAA 


ATTCGGTTTA 


ACAGTCGCAG 


TCATGCTGCT 




1 DBA 


ATCAGCAACT 


TATACATATT 


CGCCAAAATT 


TCTGGTTCTG 


GACCT ATTTT 


CPf*AflPAATA 


114U 


ACCAAGTTTT 


CCTTCTCTTT 


TTGAACCAAA 


AGAGCTGGAC 


TATAAGACAG 


APPf* IftT A AT 




AAAGCCACCA 


AGGCAAAACC 


TGAGAAAATC 


GTCCGTAATT 


4 4 VJX« 4 4 4. 4 4 V*. 


PATPAt^TTTT 

ill 




AO r AGG AAGT 


TAAAGGCAAT 


GGCTAGCACT 


GCAGAAGAAA 


GTGCCCCAAT 


CAAAATCAAA 


1320 


CTGGCATTAT TACGGTCAAT TCCCAAAAGA ATAAAGGAAC CTAGTCCCCC TGCACCAATC 1380

AAGGCCGCCA AGGTTGCCGT ACCGATAATC AAAACAGCTG CCGTCCGAAT CCCAGACATG 1440

ATAACAGGCA TGGCGAGTGG AATTTCAAAT TTCTTGAGAC CTTCCCATCT GGTCATCCCA 1500

AAGGCAATCC CAGCCTCTTG CAGGTTCGGA TCAATTCCCT TCAGCCCAGT GATAGTATTT 1560

TGCAAAATAG GGAAAATCGC ATAAATCACT AGAGCTGTCA AACCCGGCAA GGTCCCAATT 1620

CCCATCAAAG GGATAAAGAG CCCCAACAAG GCCAGAGACG GGATGGTCTG GAAAATACCT 1680

GCAATCTGCA AGACCCAGTC GGCCAGCTTC TCATGATAGC GAAGAAAAAC AGCCAAGGGA 1740
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ATCGCAAGCA AAATAGCTAG TAACAAGGTC AAAAGCGACA ACTGCAAATG TTGAGATAGA 1800 

GCTGTCAACC AATCACTAAA ACGATCCTGA AAAGTTGCAA TTAAATTAGT CATGAACACT 1860 

ACCTCCAAAC AAGTCTGCTA CAAAGTCTGT TGCAGGCGCT TTTAAAATTG TCTCGGGATT 1920 

CGCTACCTGG CGAATTTCTC CATCCTGCAA GACAGCAATA CGGTCCGCCA ACTTCAAGGC 1980 

TTCATCCGTA TCATGGGTTA CAAAAATCGT TGTCATCCCA AACTCTTTAT GCAATTCTTT 2040 

TGTCAGAACC TGCAACTGTT TTCTCGAAAT AGCATCCAAG GCCGAAAAGG GTTCATCCAT 2100 

GAGGAAAATC TTGGGCTGAC CAATCATAGC TCGGACAATA CCGACCCGTT GCTGTTCTCC 2160 

ACCAGATAAT TCACTAGGTA AGCGATGCCC ATACTCGGCT ACTGGTAAAC CAACCTTAGC 2220

CAAAAGCTCT TCTGTTTTCT TCGTAATTTC TTCCTTGCTC CACCCCTTCA TTTCAGGAAT 2280 

GAGAGCAATA TTTTCCGCAA CTGTTAGATT TGGAAAAAGA GCAATAGCCT GTAAAACATA 2340 

ACCAGTAGAA AGACGAAGTT CACGCTCATC ATAGTCTTTG ATGCGCTTCC CATCCATATA 2400 

AATATTTCCA TCAGTTGGTT CCAAAAGACG GTTAATCATC TTGAGCATGG TCGTCTTACC 2460 

TGACCCAGAA GGCCCTACTA AAACCATAAA TTCCCCATCC TCAATCTGTA AGTTGACATC 2520 

TCTCAAGACA TCCTTTTCTG TGTAGCGCAG TGCTACATTT TTGTATTCAA TCATTCTTTG 2580 

TCCTCAATTT AAAACTTCCC TCGATTGGTC AAGTCTTCTA CCTTAGGCAT AACTTCCTTA 2640 

TTATCCCAAT GCTCCACAAT TTTCCCGTTC TCTAAACGGA AGATATCGTA CTGGGCATAA 2700 

GCAACCCCAT CAATCTGAGT CTGACCATAC CTAACCACAT AGTTTCCTTG TCCTAAGAGT 2760 

TGGAAAACAA ACTCAAAACT GACACTATAT TCAGCCACAT AGTTTTTATA AGCAGCACTT 2820 

CCTTGTCCAA TATCATCATT ATGCTGAATC AAATCGTCTG CCACATAATC ACTCCACTGC 2880 

TCTAGCTCCC CATTTTGGAA AAT77CTG7C AAGAAACCG" C.VvCCACCTT TTTATTTTCT 2940

GCTTTCTTAT CCAAATCCTT GATTTCAAAA TCTCCAAAAA TTTGATCTAG TTGGTCATTT 3000 

TCAGGTGTTC GATAGTAGTC AATGACATCC CAATGCTCAA CAATACAACC ATTCTCATCC 3060 

TCACGGAAAG TATCCGTCGT CACCCATTGA GCTTCTCCAC CATTCAGATA TTGATGAACA 3120 

TGAACAAACA CCAGATTGCC ATCCTCAATG GTGCGGACAA TCTTAATCTG ACGCTCTGGA 3180 

TGACGCTCAA AGAAATCTGC AAAGAAGGCT GCAAATCCTT CTTTCCCGTC AGGAACACCT 3240 

GTCGAATGTT GGATATAGGT ATCCCCTACA GACTGGGCTT GAGCCTCAGC AACTCGTCCG 3300 

TCTTGAATGG CATGGATGTA TAGGTTGTGA GCATTTTTCA CTTGTTGTGA CATATTCTAA 3360

ACCTCATTTC CCTTCTCTTT CAGATTCGCC AAAATTCTTT CTTGAAAACC TTCAAATTGG 3420 

TGAATTTCTT CCTCTGAAAA TCCTTTGTAA AAGATAGTAT CCAATTTCTG ACTGACACGA 3480
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TGCCCCACTT CTTTCTGGGA CTTGCCTAAC TCCGTTAAAA CTAAATACTT CTTACGCTTG 3540 

TCTTTTCCAC ACGGACTAAC AATTACAAGC TTTTGTTCCT CTAGCTTTTT TATCATAGTC 3600 

GTCAGCGTAT TATTCGCAAG TCCAGTCGCA AGCGCGATAT CTGTCGCAGT TGCGCAGCCA 3660 

GTTTCACTAT TCCATAAAAC CGCTAAAATC TTGCCCTGTT CACCCCTATA AAGAGCCTCA 3720 

GGATCTTGAC TCAGTAACTT TTGAAAAATC CGCCCATTCA ACAAACGAAT ATGATGGGCT 3780 

AGCAAATGAC CATCTTTCAT AACACCTCCA ATTTATTTCG ATATCGAAAT GAATAAAACA 3840

ATTGTAACAC TCATCGTTCT AACTGTCAAC TATTTCGATT TAGAAATAAT TTTTGATAAT 3900

TATCCACACC ACCATACTCC GGCTCAACTA ACTTTTAACG AGAGTTTCTA AACTCCTTCG 3960 

TCCTCCAGTC TACAAAAGCC TTCCATTCGT ACTATCCTAT ATTTTATGAG GGGACACATT 4020 

TTTCCTATCA GACCATTTAT TTTAAAGATA GAAGTAAATC ATAATTGCTT CCATCTGTTC 4080 

TTTTATAGTA TATTGAAGTT AGACTAGAGC ACTGTATCTT CTAAAACATT GATAGAAAGC 4140

GATTTGAATT TCCCAATCAA TTTGTTCGTA TTTATAGCAT TTCGAAACTG GAATAGGACA 4200 

CCATGACTGC TAAAAGATTT CTATAAATTC ATTTAATTTC CTCAATCAAT TTGTTCATAT 4260 

CTTATTTCAT TCCGCTATAA TTTCACCTTA CCCTATCTTT TTCGTAGCAC CCTTCAAACA 4320 

GCCTATCCCC TACCGTTTGA CGATTCCTCA CTTCGCTCCA CTTCCATTAC AGAAGTTTCT 4380

TCACTACTAT GGGCTCGGCT GACTTCTCAT GATTCCTTGT TACTACTATT TGAACGCTCA 4440

CGAGATAGAT CTTACAAAAA ATGCTTTGAT CCACAATGGA ATCAAAGCAT TTTAAAGAGT 4500

TCCTCATACA TAAGCGCAGA AGTCGCAGTT CCTCTGTACT TGGCTTCTTC TCTTTTGACA 4560 

AAGCGAGCCA AGTTGAGCAA CTCAGGTGCT CGATGTTTGG GATTTAGGAG CAATTCACGA 4620

TTGACCAGGC CTGAGAGACG AACTGCCTGC AATTGCTCAT TTGTAGTAGG CAGTTTTTTA 4680

GTAGTCTCTA GGAGAGCAGC AACTAAATCT TCACTCAAAT CATGTCGAGC ATGATTGTAA 4740 

AGATCTTTTA TAAGGCTTTC TAGGTTTGGT TCTACCATCC CTACCACCTC CCTTATGGTT 4800 

TAATAATGTT TAATCAAATC AACCGTTGAA CGATCCAATT TCTTCACCAA GGCTTGTAAG 4860 

AAAGCTTGCG CTTCTAGGAA GTCATCCATT GCATAGAGGG TTTGGTGAGA ATGGATATAA 4920 

CGAGCGCAGA CACCGATAGT TGTTGATGGG ACACCACCAT TTTTCAGATG AGCTGCACCT 4980

GCATCTGTTC CGCCTTTACC ACAGTAGTAT TGGTACTTGA TACCAGCTTC TTCAGCCGTT 5040 

GTCAAAAGGA AATCCTTCAT CCCTGGGAGA AGCAAGTGAC CTGGATCATA GAAACGAATC 5100

AAGGTTCCAT CTCCAATCTT GCCTTGACCA CCGTAGACAT CACCTGCTGG TGAGCAATCA 5160 

ACTGCGAGGA AGACTTCTGG GTCAAACTTG GTTGTAGAGG TATGAGCGCC ACGCAGACCA 5220 

ACTTCTTCTT GGACGTTAGA ACCCAGATAG AGTTCATTGC CGAGTTTTTG ACCCGATAAA 5280 
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GCTTCAGCTA GCTCCCTTAC CATGAGGACA CCGTAGCGGT TATCCCAAGC TTTTGAGATG S3 40 

ATATTTTTTT CATTGGCTGT CAAAATTGCA GAACTATCTG GTACAATGGT ATCACCAGGA 5400

CGGATGCCAA AACTTTCTGC CTCAGCCTTG TCCGCAAAAC CACCATCAAA AACGATATCG 5460

GCAATGGCTG GCATGGTTGG TCCCCCCTTT CCACGAGTCA AATGCGGACG AACAGAACCT 5520 

GAAATCACAG GAATTTCATG ACCATCACGA GTCAAGAGTT TGAAACGTTG GCTGCTAACC 5580 

ACCATGGGGT TCCAGCCACC GATTTCTACG ACACGGAAGG TACCATCTGG CTTGATTTCG 5640

CTGACCATAA AACCAACTTC GTCCATATGA GAAGCGACCA AGACGCGCGG TGCATCCACA 5700 

GCTTCTGAAT GTTTGATACC AAAAATACCA CCCAAGCCAT CTGTCACCAC TTCATCCACA 5760

TGCGGTGTCA ACTTTTCACG AAGATAAGCA CGGACAGGCG CTTCATGACC TGAGACTGCA 5820 

GCAAGTTCTG TTACTTCTTT AATTTTTGAA AATAATGTTG TCATTTCAGT TCCTTCTTTC 5880

TTTCATCCAT TTTACCACTT TTTATAGGAG AAGGATAGTG GGAAGGTGGA TTTCTAAGTT 5940 

AGTATCTTAG TCCTGCTCTA TCTTAGAAAA GGATAGTATT CTCTTGCATG TAGTGCAAAA 6000

TCTAGTAAAC ATTCCAAAAT TAACTCGAAT ATTTATTTCC AAACAAAAAA ACAATACACC 6060 

ATCAAAGTTG TTTGGATTTT TCATGAAATT TACAGAAAAT AGTTGACTTC CCTTTCTTCT 6120 

TTCTTTAAAT ATATAGTTGG TTGAGTTTGG AATAGTACGC TGTAGCTGCT AAAACATTTC 6180 

TAGAAATTAA TTTGACTTTC CTAATAGAGT TGTTCATATC TTATTTCAAT TTACTATAGT 6240 

ACAAAACTAG AAAAGGAAAA AATCATGACC AGG 6273



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28171 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

ACAACCTTTT TCAAAAACTC ACCTTGGTAC GGAGATGTTT TGCTTTCTGC TATTATTTTC 60 

GGTTATATTC ATATCAATTT TGCTTTAACT CCTCTTGCTT TTTTCATTTA TCCTAGTGGA 120 

GGTCTTATTT TAGCTCTATT GTATCGCATG ACTAAAAATC TCTACTATCC AATACTAGTT 180 

CATATTCTCA TTAATATCAC TGCCTTCTGG GATGTGTGGT TGCTCCTATT TTCAGGAAGT 240

TAGCTTACTA AAATAATGTC GGAACTTTCC GGCATTTTCT TTTTTCACAA ATAGTCAACG 300 

TTTTTCTTTT CGATATTGTA GTGGTGTGTA TCCAGTTATT TTTTTGAATT GATTTTGAAA 360
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ATAAGGTTGA 


CTTGAGAAAG 


GCAGATAGTG 


AAGATAGTTA 


AGAAGAATAG 


GATGTTCTTT 


420

TTTCCTTTTT GGAAAACTTC TAAAATATGG TATAATGAAA AGATAAAGAA GTTGGGGGTA 480

GAAGATGAAC ATTCAACAAT TACGCTATGT TGTGGCTATT GCCAATAGTG GTACTTTTCC 540

TGAAGCTGCT GAAAAGATGT ATGTTAGTCA GCCGAGTCTG TCTATTTCTG TTCGTGATTT 600


GGAAAAAGAG 


TTGGGCTTTA 


AGATTTTCCG 


TCGGACCAGC 


TCAGGGACTT 






TCGTGGGATG 


GAATTTTATG 


AAAAATCGCA 


AGAATTGGTT 


AAACCATTTG 


AT ATTTTTP A 




AAATCAGTAT 


GCCAATCCTG 


AAGAAGAAAA 


AGATGAATTT 


TCTGTTGCTA 


GCCAGCACTA 




TGACTTCTTG 


CCACCAACTA 


TTACGGCCTT 


TTCAGAGCGC 


TATCCTGACT 


ATAAGAACTT 




CCGTATTTTT 


GAATCAACTA 


CTGTTCAAAT 


ATTAGATGAA 


GTGGCGCAAG 

Vp* * XJ W X» X^X^ «k/»X# 


GGCATAGTGA 




GATTGGGATT 


ATCTACCTCA 


ACAATCAAAA 


TAAAAAGGGG 


ATTATGCAAC 


GCG TTGAAAA 


o fin 


ATTAGGTCTG 


GAGGTCATCG 


AATTGATTCC 


TTTCCATACC 


CATATTTATC 






TCATCCTTTA 


GCCCAGAAAG 


AGGAATTAGT 


CATGGAGGAT 


TTAGCGGATT 




i nan 

1UOU 


TCGTTTCACT 


CAAGAGAAAG 


ACGAGTACCT 


TTATTATTCA 




Tf GATAfPAf: 




CGCTAGCTCA 


CAGATGTTTA 


ATGTGACAGA 


CCGTGCCACC 

X» ^mf m> X*X»i^» a<X^f* X^ 


TTGAATGGTA 






GACGGACGCC 


TATGCGACAG 


GTTCTGGATT 


TTTAGATAGT 


GACACTGTTA 






AGTTATTCGT 


CTCAAGGATA 


ACCTAGATAA 


CCGCATGGTC 


^ATGTTAAAr 


GTGAAGAAfIT 
VJ 1 V?/Wu A4\\J 1 


i ton 


GGAGCTTAGT 


CAAGCTGGGA 


CTCTCTTCGT 


AGAAGTCATG 

* m XJ r *J %X^ 4 X»*» ** X* 


CAAGAATATT 


1 1 \J*\ 1 ^Annn 


i ion 


GAGGAAATCA 


TGAAAAAAAG 


AGCAATAGTG 


GCAGTCATTG 

X«V X»# A X**T^ 4 a> XJ 


TACTGCTTTT 


1 1 OuOv 1 \j 


my 


GATCAGTTGG 


TCAAATCCTA 


TATCGTCCAG 


CACATTCCAC 


TGGGTGAAGT 


GCCCTCCTCC. 


i *^on 


ATCCCCAATT 


TCGTTAGCTT 


GACCTACCTG 


CAAAATCGAG 

X* • mm mm X« » #> X^ x**r ^>X^ 


GTGCAGCCTT 


TTTTATfrT & 
* » v» i n i V— l l /a 


1 3 0 V 


CAAGATCAGC 


AGCTGTTATT 


CGCTGTCATT 


ACTCTGCTTG 

■ *X» * Xv 4- ^mT%m* *- *» X* 


TCGTGATAGG 


1 V* A -111 V^»J 




TATTTACATA 


AACACATGGA 


GGACTCATTC 


TGGATGGTCT 


TGGGTTTGAC 


TCTAATAATC* 

* X« M. fVl A «Xam A x* 




GCGGGTGGTC TTGGAAACTT TATTGACAGG GTCAGTCAGG GCTTTGTTGT GGATATGTTC 1740

CACCTTGACT TTATCAACTT TGCAATTTTC AATGTGGCAG ATAGCTATCT GACGGTTGGA 1800

GTGATTATTT TATTGATTGC AATGCTAAAA GAGGAAATAA ATGGAAATTA AAATTGAAAC 1860

TGGTGGTCTG CGTTTGGATA AGGCTTTGTC AGATTTGTCA GAATTATCAC GTAGTCTCGC 1920

GAATGAACAA ATTAAATCAG GCCAGGTCTT GGTCAATGGT CAAGTCAAGA AAGCTAAATA 1980

CACAGTCCAA GAGGGTGATG TCGTCACTTA CCATGTGCCA GAACCAGAGG TATTAGAGTA 2040

TGTGGCTGAG GATCTTCCGC TAGAAATACT CTACCAAGAT GAGGATGTGG CTGTCGTTAA 2100

CAAACCTCAG GGAATGGTTG TGCACCCGAG TGCTGGTCAT ACCAGTGGAA CCCTAGTAAA 2160
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TGCCCTCATG TATCATATTA AGGACTTGTC GGGTATCAAT GGGGTTCTGC GTCCAGGGAT 2220 

TGTTCACCGT ATTGATAAGG ATACGTCAGG TCTTCTCATG ATTGCTAAAA ACGATGATGC 2280 

GCATCTAGCA CTTGCCCAAG AACTCAAGCA TAAAAAGTCT CTCCGCAAA? ATTGGGCGAT 2340 

TGTTCATGGA AATCTACCTA ATGATCGTGG TGTAATTGAA GCGCCGATTG GCCGGAGTGA 2400

AAAAGACCGT AAGAAACAGG CTGTAACTGC TAAAGGGAAG CCTGCAGTGA CGCGTTTTCA 2460

CGTCTTGGAA CGCTTTGGCG ATTATAGCTT AGTAGAGTTG CAACTGGAGA CAGGGCGCAC 2520 

TCATCAAATC CGTGTCCACA TGGCTTATAT CGGCCATCCA GTCGCTGGTG ATGAGGTCTA 2580 

TGGTCCTCGC AAGACTTTGA AAGGACATGG ACAATTTCTT CATGCCAAGA CTTTAGGTTT 2640

TACTCATCCG AGAACAGGTA AGACCTTGGA ATTTAAAGCA GATATCCCAG AGATTTTTAA 2700 

GGAAACCTTG GAGAGATTGA GAAAGTAAGA ATGAAAAAGA AATTAACTAG TTTAGCACTT 2760 

GTAGGCGCTT TTTTAGGTTT GTCATGGTAT GGGAATGTTC AGCCTCAAGA AAGTTCAGGA 2820 

AATAAAATCC ACTTTATCAA TGTTCAAGAA GGTGGCAGTG ATGCGATTAT TCTTGAAAGC 2880

AATGGACATT TTGCCATGGT GGATACAGGA GAAGATTATG ATTTCCCAGA TGGAAGTGAT 2940 

TCTCGCTATC CATGGAGAGA AGGAATTGAA ACGTCTTATA AGCATGTTCT AACAGACCGT 3000 

GTCTTTCGTC GTTTGAAGGA ATTGGGTGTC CAAAAACTTG ATTTTATTTT GGTGACCCAT 3060 

ACCCACAGTG ATCATATTGG AAATGTTGAT GAATTACTGT CTACCTATCC AGTTGACCGA 3120 

GTCTATCTTA AGAAATATAG TGATAGTCGT ATTACTAATT CTGAACGTCT ATGGGATAAT 3180 

CTGTATGGCT ATGATAAGGT TTTACAGACT GCTGCAGAAA AAGGTGTTTC AGTTATTCAA 3240

AATATCACAC AACGGGATGC TCATTTTCAG TTTGGGGACA TGGATATTCA GCTCTATAAT 3300 

T >.TG AAAAT G AAACTCATTC ATC3GGTGAA TT AA.XG AAAA TTTGGGATGA CAATTCCAAT 3 360 

TCCTTGATTA GCGTGGTGAA AGTCAATGGC AAGAAAATTT ACCTTGGGGG CGATTTAGAT 3420

AATGTTCATC GAGCAGAAGA CAAGTATGGT CCTCTCATTG GAAAAGTTGA TTTGATGAAG 3480 

TTTAATCATC ACCATGATAC CAACAAATCA AATACCAAGG ATTTCATTAA AAATTTGAGT 3540 

CCGAGTTTGA TTGTTCAAAC TTCGGATAGT CTACCTTGGA AAAATGGTCT TGATAGTGAG 3600 

TATGTTAATT GGCTCAAAGA ACGAGGAATT GAGAGAATCA ACGCAGCCAG CAAAGACTAT 3660 

GATGCAACAG TTTTTGATAT TCGAAAAGAC GGTTTTGTCA ATATTTCAAC ATCCTACAAG 3720 

CCGATTCCAA GTTTTCAAGC TGGTTGGCAT AAGAGTGCAT ATGGGAACTG GTGGTATCAA 3780 

GCGCCTGATT CTACAGGAGA GTATGCTGTC GGTTGGAATG AAATCGAAGG TGAATGGTAT 3840

TACTTTAACC AAACGGGTAT CTTGTTACAG AATCAATGGA AAAAATGGAA CAATCATTGG 3900 




TTCTATTTGA CAGACTCTGG TGCTTCTGCT AAAAATTGGA AGAAAATCGC TGGAATCTGG 3960 

TATTATTTTA ACAAAGAAAA CCAGATGGAA ATTGGTTGGA TTCAAGATAA AGAGCAGTGG 4020 

TATTATTTGG ATGTTGATGG TTCTATGAAG ACAGGATGGC TTCAATATAT GGGGCAATGG 4080 

TATTACTTTG CTCCATCAGG GGAAATGAAA ATGGGCTGGG TAAAAGATAA AGAAACCTGG 4140 

TACTATATGG ATTCTACTGG TGTCATGAAG ACAGCTGAGA TAGAAGTTGC TGGTCAACAT 4200

TATTATCTGG AAGATTCAGG AGCTATGAAG CAAGGCTCGC ATAAAAAGGC AAATGATTGG 4260 

TATTTCTACA AGACAGACGG TTCACGAGCT GTGGGTTGGA TCAAGGACAA GGATAAATGG 4320

TACTTCTTGA AAGAAAATGG TCAATTACTT GTGAACGGTA AGACACCAGA AGGTTATACT 4380

GTGGATTCAA GTGGTGCCTG GTTAGTGGAT GTTTCGATCG AGAAATCTGC TACAATTAAA 4440 

ACTACAAGTC ATTCAGAAAT AAAAGAATCC AAAGAAGTAG TGAAAAAGGA TCTTGAAAAT 4500 

AAAGAAACGA GTCAACATGA AAGTGTTACA AATTTTTCAA CTAGTCAAGA TTTGACATCC 4560 

TCAACTTCAC AAAGCTCTGA AACGAGTGTA AACAAATCGG AATCAGAACA GTAGTAGAAA 4620

AGAAGGTTTT AGGGCCTTCT TTTTCCTATC AACTCTTTTC TATTTCCTGT TATTCATGTT 4680 

ATAATGGATA AATATGAATA ATCGGAGTGA GACTATGAAA TACAAACGGA TTGTCTTTAA 4740 

GGTGGGTACT TCTTCTCTGA CAAATGAGGA TGGAAGTTTA TCACGTAGTA AGGTAAAGGA 4800 

TATTACCCAG CAGTTGGCTA TGCTGCACGA GGCTGGTCAT GAGTTGATTT TGCTGTCTTC 4860 

AGGTGCCATT GCGGCTGGTT TTGGAGCCTT AGGATTTAAA AAGCGTCCGA CTAAGATTCC 4920

TGATAAACAG GCTTCAGCAG CGGTAGGGCA GGGGCTTTTC TTGGAAGAAT ATACAACCAA 4980 

TCTTCTCTTG CGTCAAATCG TTTCTGCACA AATCTTGCTG ACCCAACATG ACTTTGTGGA 5040

TAAGCGTCGT TATAAAAATG CCCATCAGGC TTTGTCGGTT TTGCTCAACC GTGGGGCAAT 5100 

TCCTATCATC AATGAGAATG ATAGTGTCGT TATTGATGAG CTCAAGGTTG GGGACAATGA 5160 

CACTCTAAGT GCTCAAGTAG CGGCGATGGT CCAAGCAGAC CTTTTAGTTT TCTTGACAGA 5220 

TGTGGACGGT CTCTATACTG GAAATCCTAA TTCAGATCCA AGAGCCAAAC GCTTGGAGAG 5280

AATCGAGACC ATCAATCGTG AGATTATTGA TATGGCTGGT CCAGCTGCTT CGTCAAACGG 5340 

AACTGGGGGT ATGTTAACCA AAATCAAGGC TGCAACTATC GCGACGGAAT CAGGAGTTCC 5400 

TGTTTATATC TGCTCATCCT TGAAATCAGA TTCCATGATT GAGGCGGCAG AGGAGACCGA 5460

GGATGGTTCT TACTTTGTTG CTCAAGAGAA GGGCCTTCGT ACCCAGAAAC AATGGCTTGC 5520

CTTCTATGCT CAGAGTCAAG GTTCTATTTG GGTTGATAAA GGGGCTGCGG AAGCTCTCTC 5580 

TCAATATGGA AAGAGTCTTC TCTTATCTGG TATCGTTGAA GCAGAAGGAG TCTTTTCTTA 5640 

OGGTGATATC GTGACAGTAT TTGACAAGGA AAGTGGAAAA TCACTTGGAA AAGGACGCGT 5700 
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GCAATTTGGA GCATCTGCTT TGGAGGATAT GTTGCGTTCT CAAAAAGCCA AGGGTGTCTT 5760 

CATTTACCGT GACGACTGGA TTTCCATTAC TCCTGAAATC CAACTACTTT TTACAGAATT 5820 

TTAGAGGTAA ACTATGGTGA GTAGACAAGA ACAATTTGAA CAGGTACAGG CTGTTAAAAA 5880 

ATCGATTAAC ACAGCTAGTG AAGAAGTGAA AAACCAAGCC TTGCTAGCCA TGGCTGATCA 5940 

CTTAGTGGCT GCTACTGAGG AAATTTTAGC GGCTAATGCC CTCGATATGG CAGCGGCTAA 6000 

GGGGAAAATC TCAGATGTGA TGTTGGATCG TCTTTATTTG GATGCAGATC GTATAGAAGC 6060 

GATGGCAAGA GGAATTCGTG AAGTGGTTGC CTTACCAGAT CCAATCGGTG AAGTTTTAGA 6120 

AACAAGTCAG CTTGAAAATG GTTTGGTTAT CACAAAAAAA CGTGTAGCTA TGGGTGTCAT 6180 

CGGTATTATC TATGAAAGCC GTCCAAATGT CACGTCTCAT GCGCCTGCTT TGACTCTTAA 6240 

GAGTGGAAAT GCGGTTGTTC TTCGTAGTGG TAAGGATGCC TATCAAACAA CCCATGCCAT 6300 

TGTCACAGCC TTGAAGAAGG GCTTGGAGAC GACTACTATT CATCCAAATG TGATTCAACT 6360 

GGTGGAGGAT ACTAGCCGTG AAAGTAGTTA TGCTATGATG AAGGCCAAGG CCTATCTAGA 6420

CCTTCTCATT CCTCGTGGAG GAGCTGGCTT GATCAATGCA GTGGTTGAGA ATGCGATTGT 6480 

ACCTGTTATC GAGACAGGGA CTGGGATTGT CCATGTCTAT GTGGATAAGG ATGCAGACGA 6540 

AGACAAGGCG CTGTCTATCA TCAACAATGC TAAAACCAGT CGTCCTTCTG TTTGTAATGC 6600 

CATGGAGGTT CTGCTGGTTC ATGAAAACAA GGCAGCAAGC TTCCTTCCTC GCTTGGAGCA 6660 

AGTGTTGGTT GCAGAGCGTA AGGAAGCTGG ACTGGAACCA ATTCAATTCC GCCTAGATAG 6720 

CAAAGCAAGC CAGTTTGTTT CAGGTCAAGC ACCTGACACC CAAGACTTTG ACACCGAGTT 6780 

TTTAGACTAT GTCCTTGCTG TTAAGGTTGT GAGCAGTTTA GAAGAAGCGG TTGCGCACAT 6840 

TGAATCCCAC AGCACCCATC ATTCGGATGC TATTGTGACG GAAAATGCTG AAGCTGCAGC 6900 

ATACTTTACA GATCAAGTGG ACTCTGCAGC GGTGTATCTT AATGCCTCAA CTCGTTTCAC 6960 

AGATGGAGGA CAATTTGGTC TTGGTTGTGA AATGGGGATT TCTACTCAGA AATTGCACCC 7020

GCCTGGTCCC ATGGGCTTGA AAGAGTTGAC CAGCTACAAG TATGTGGTTG CCGGTGATGG 7080 

GCAGATAAGG GAGTAAGAGA TGAAGATTGG ATTTATCGGT TTGGGGAATA TGGGTGCTAG 7140 

CTTGGCAAAA TCTGTCTTGC AGACTAGGAC GTCAGATGAG ATTCTCCTTG CCAATCGTAG 7200 

TCAAGCTAAG GTAGATGCTT TCATTGCAGA CTTTGGTGGT CAGGCTTCCA GCAATGAAGA 7260 

AATGTTTGCA GAAGCAGATG TGATTTTTCT AGGAGTTAAG CCTGCTCAGT TTTCTGAACT 7320 

GCTTTCTCAA TACCACACCA TCCTTGAAAA AAGAGAAAGT CTTCTTTTGA TTTCGATGGC 7380 

AGCTGGATTG ACCTTAGAAA AACTAGCAAG TCTTATCCCA AGTCAACACC GAATTATTCG 7440 
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TATGATGCCT AATACCCCTG CTTCTATCGG GCAAGGAGTG ATTAGTTATG CCTTGTCTCC 7500 

TAATTGCAGG GCTGAGGACA GTGAGCTCTT TTATCAGCTT TTAGCCAAGG CTGGTCTCTT 7560 

CGTTGAACTA GGAGAAAGTT TAATCGATGC AGCGACAGGT CTTGCAGGTT GTGGACCAGC 7620 

CTTTGTCTAT CTTTTTATCG AGGCCTTGGC AGATGCAGGT GTTCAGACAG GATTACCACG 7680 

AGAAATAGCA TTGAAAATGG CAGCACAAAC TGTGGTAGGA GCTGGGCAAT TGGTCCTTGA 7740 

AAGTCAGCAA CATCCTGGAG TATTGAAAGA CCAAGTCTGT AGCCCAGGCG GTTCGACTAT 7800 

CGCTGGTGTA GCAAGCCTAG AAGCGCATGC TTTCCGAGGA ACAGTCATGG ATGCAGTTCA 7860 

TCAAGCCTAC AAACGAACAC AAGAACTAGG TAAATAAGAG GTAGTTTTGA CTGCCTCTTT 7920 

TATGGTGGCT GAAATGAGAA GACACAAAAA GATTGTCACA AACCCCTATT TTTTTGATAG 7980 

AATAGAAGTA GTAAAAAAGA AATGAGTTAG ACATGTCAAA AGGATTTTTA GTCTCTCTTG 8040 

AGGGACCAGA GGGAGCAGGC AAGACCAGTG TTTTAGAGGC TCTGCTACCA ATTTTAGACC 8100 

AAAAAGGAGT AGAGGTGTTG ACGACCCGTG AACCTGGCGG AGTCTTGATT GGGGAGAAGA 8160 

TTCGGGAAGT GATTTTGGAT CCAAGTCATA CTCAGATGGA TGCTAAAACA GAGCTACTTC 8220 

TCTATATTGC CAGTCGCAGA CAGCATTTGG TGGAAAAAGT TCTTCCAGCC CTTGAAGCTG 8280 

GCAAGTTGGT CATCATGGAT CGTTTTATCG ATAGTTCTGT TGCCTATCAC GGATTTGGTC 8340 

GTGGCTTAGA TATTGAAGCC ATTGACTGGC TCAATCAGTT TGCGACAGAT GGCCTCAAAC 8400 

CCGATTTGAC ACTCTATTTT GACATCGAGG TGGAAGAAGG GCTGGCTCGT ATTGCTGCTA 8460 

ATAGTGACCG CGAGGTTAAT CGTTTGGATT TGGAAGGGTT GGACTTGCAT AAAAAAGTTC 8520 

GTCAAGGCTA CCTTTCTCTT CTGGATAAAG AGGGAAATCG CATTGTCAAG ATTGATGCTA 8580 

GTCTCCCTTT GGAGCAAGTT GTGGAAACTA CCAAGGCTGT CTTGTTTGAC GGAATGGGCT 8640

TGGCCAAATG AAACAAGATC AACTAAAGGC TTGGCAACCA GCTCAGTTTG ACCGTTTTGT 8700 

CCGTATCTTA GAACAAGACC AGCTCAATCA CGCCTATCTC TTTTCAGGTT TCTTTGAAAG 8760 

CTTGGAAATG GCGCAATTTT TAGCTAAGAG CCTCTTTTGT ACGGATAAAG TTGGCGTCTT 8820 

ACCATGTGAG AAATGCCGAA GTTGCAAGCT GATTGAACAG GGAGAATTTC CCGATGTCAC 8880 

CTTGATTAAA CCAGTTAATC AGGTCATTAA GACGGAACGC ATTCGAGAAT TGGTGGGTCA 8940 

GTTTTCTCAA GCAGGGATTG AAAGCCAGCA ACAGGTCTTT ATCATCGAGC AAGCCGATAA 9000 

AATGCATCCC AACGCAGCCA ATTCTCTGCT CAAGGTCATC GAAGAACCCC AGAGTGAAGT 9060 

TTATATTTTC TTCTTGACTA GCGATGAGGA AAAGATGTTA CCGACAATCC GAAGTCGGAC 9120 

TCAGATCTTC CACTTTAAAA AGCAAGAAGA AAAACTTATC TTACTCTTAG AACAAATGGG 9180 

ACTTGTTAAG AAAAAAGCGA CTCTTTTACC TAAGTTTAGT CAATCGCGAG CTGAAGCAGA 9240 
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AGAAGGTCAT TTTTTGCCAA CATACGAAAA TAAAGAAGAA G3WW\GCTG AGATAGAGAA 11040 

AACATTTGAG AAATATATTT TAGAATTTGA TAATATTCCA GAAAATTTAA AAGATAAGAG 11100 

AGCTGATGAA GTTGACAGAA CTCCAGCAGA AAACCTTGCT TATCAGGTTG GTTGGACCAA 11160 

CTTGGTTCTT AAATGGGAAG AAGATGAAAG AAAGGGGCTT CAAGTAAAAA CACCATCGGA 11220 

TAAATTTAAA TGGAATCAAC TTGGTGAATT ATATCAGTGC TTCACAGATA CCTACGCTCA 11280 

TTTATCTCTG CAAGAGTTGA AAGCAAAATT AAATGAAAAT ATTAATTCTA TCTCTGCAAT 11340 

GATTGATTCG TTGAGTGAGG AAGAATTATT TGAACCGCAT ATGAGAAAGT GGGCTGATGA 11400 

AGCGACTAAA ACAGCGACTT GGGAAGTGTA TAAGTTTATT CATGTAAATA CGGTTGCACC 11460

TTTTGGAACT TTCAGAACTA AAATCAGAAA ATGGAAGAAG ATAGTATTAT AAATTATATT 11520 

TTTAACTTTA AAAAATTTCA TAAAAATGGT TACCAAAGGC GATAGAAGAA AAACTATCGT 11580 

CTTTTTCT' l T GCAAATTTTT AAGAAGGGAG GTGATCTTGC ATGGACTTTG AATATTTTTA 11640 

TAACAGAGAA GCGGAAAGAT TTAACTTCTT AAAAGTACCG GAGATATTAG TTGATAGAGA 11700

AGAATTTCCG GGCTTATCAG CAGAAGCAAT TATCCTTTAT TCCATACTTC TTAAACAGAC 11760 

AGGAATGTCA TTTAAGAATA ACTGGATAGA CAAGGAAGGC AGAGTATTTA TCTATTTTAC 11820 

TGTCGAAGAA ATTATGAAAA GAAGAAATAT CTCAAACCCA ACTGCCATAA AAACATTAGA 11880

TGAGCTTGAT GTAAAAAAGG AATAGGACTG ATCGAAAGAG TAAGCCTTGG ACTTGGTAAG 11940 

CCGAACATCA TTTATGTTAA AGACTTTATG AGTATATTTC AGGTAAAAGA AAATGACTTA 12000 

CAGAACTCAA AAAACTTAAC TTCAGAAGTA AAAGATTTTA ACCTCAGAAG TAAAGAAAAT 12060 

GAACTTCAAG AGGTTAAGAA CCTTGACTCT AACTATATAG AGAATAATAA GAGTAAGTAT 12120 

AGTAAGAGAG AATATAGTTT TGGTGAAAAC GGACTTGGAA CATTTCAAAA TGTGTTTTTA 12180 

GCTGCTGAAG ATATATCGGA TTTACAAATC ATAATGAACT CACAGCTTGA CAATTACATT 12240 

AGACTTCCTG CAAAACTAGA ATCCTAGTTC ATGATTGATA ATGCCAGCAA TCAAATTCAT 12300 

TCCTAATCCG AAGCGTTTAC GATGATTTCC ATAGATTGTT GAAAACATTT TAAACGTTTT 12360

TACTTTGGCA AAGATGTTCT CAATCTTGCT TCTCTCCTTG GATAGCGCAT GCTTACAGGC 12420 

TTTATCTTCA GCTGTTAGCG GCTTGAGTTT GCTGGATTTA CGTGGAGTTT GTACTTGAGG 12480 

ATATATCTTC ATCAGCCCTT GATAACCACT GTCAGACAAG ATTTTACCAG CTTGTCCGAT 12540 

ATTTCTGCGA CTCATTTTGA ACAACTTCAT ATCACGACAA TAGTTCACAG CGATATCCAA 12600 

AGAAACAATT CTCCCTTGAC TTGTGACAAT CGCTTGAGCC TTCATAGCGT GAAATTTCTT 12660 

TTTACCAGAA TGATTCGCTA ATTCTTTTTT TAGGGCGATT GATTTTTACT TCCGTCCCAT 12720 

CAATCATTAC CGTGTCCTCA GAACTGAGAG GAGTTCTTGA AATCGTAACA CCACTTTGAA 12780 
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CAAGAGTTAC TTCAACCCAT TGGCTCCGAC GGATTAAGTT GCTTTCCTGA ATACCAAAAT 12840 

CAGCCGCAAT TTCTTCATAA GTTCGATATT CTCGCACATA TTGAAGAGTG GCCATAAGAA 12900 

GGTCTTCTAG GCTTAATTTA GGTTTTCGTC CACCTTTTGC GTGTTTAAGT TGATAAGCTG 12960 

TTTTTAATAC AGCTAATATC TCTTCAAAAG TCGTGCGCTG AACACCAACA AGACCCTTAA 13020 

ATCGTGCATC AGTTAGTTGT TTACTTGCTT CATCATTCAT AGAACTACTA TACCATATTT 13080 

TGTTTCGCAG GAAGTCTATT GGAAAGTAAG AAATATTGAA GCTGAGGCTA TTAGAAGAAA 13140

TTGTGAGCGT GGTGCTATTT TTTCAGGTAA AATAAAATAT CACGAAGATT CACAGTTTAA 13200 

AGGAGATCAC TATGTTGAAT GTTATGCTGT TTTAGATAAT ACGGTTATAG CAAGAGATAG 13260 

AATAACAGTC CCTATCGATC CGTTATGTGG AAAAGATTTT ATAGAGTAGC ATATAATTGA 13320 

TTCTTAACTG GAATACTCAC TATCTCTTTA CATCAAGAAA ATGACTAAAC AGGGAAGTTT 13380 

GCCTTCTTCC CTTTTTTTGT TATACTAGTA GAAGAAAAAA TTAGAAAGAT TTGTGGGTGT 13440 

CAAACAGCCC AGTGGGGTGT TTTAATATGG ACTTAGGTCC CACCCAAAGA GGTATTAGTG 13500 

TCGTGTCTCA ATCTTATATC AATGTTATCG GTGCTGCTTT GGCAGGTTCT GAAGCACCTT 13560 

ACCAAATCGC AGAGCCTGGT ATTCCACTTA AACTATATGA AATGCGTGGT GTCAAGTCTA 13620 

CACCCCAGCA TAAAACAGAC AATTTTGCTG AGTTGGTTTG TTCCAATTCT TTGCGTGGGG 13680 

ATGCTTTGAC AAATGCAGTT GGTCTTCTCA AGGAAGAAAT GCGTCGCTTG GGTTCTCTTA 13740 

TCTTGGAATC TGCTGAGGCT ACACGTGTTC CTGCAGGTGG TGCCCTTGCA GTGGACCGTC 13800 

ATGGTTTCTC TCAAATGGTG ACCGAAAAAG TTGCCAACCA CCCCTTGATT GAAGTGGTTC 13860

GTGATGAAAT TACAGAATTG CCGACAGATG TTATTACGGT TATCGCTACT GGTCCTTTGA 13920 

TAACTjATGC C7TGG^TCA.\ AAGATT^ATC C"^— WrC.- C^^7rCT^"T T7TTATTTC? 13980 

ACGATGCGGC AGCGCCTATT ATCGATGTCA ACACTATCGA TATGAGCAAG GTCTACCTCA 14040 

AATCACGTTA TGATAAGGGA GAAGCGGCCT ACCTCAATGC CCCTATGACC AAGCAAGA \T 14100 

TTATGGATTT CCATGAAGCT TTGGTCAATG CAGAAGAAGC ACCGCTTAGT TCTTTTGAAA 14160 

AAGAAAAGTA CTTTGAAGGA TGTATGCCTA TCGAAGTCAT GGCCAAACGT GGCATTAAAA 14220 

CTATGCTTTA TGGCCCTATG AAGCCAGTCG GTCTTGAGTA CCCAGACGAC TATACAGGAC 14280

CTCGTGATGG AGAATTTAAA ACACCTTATG CGGTTGTGCA ACTTCCTCAG GATAATGCAG 14340

CTGGTAGCCT CTACAATATT GTTGGTTTCC AGACCCACCT CAAATGGGGA GAACAAAAGC 14400 

GTGTCTTCCA AATGATTCCG GGTCTTGAAA ATGCGGAGTT TGTCCGTTAT GGTGTGATGC 14460 

ATCGCAATTC TTACATCGAT TCACCAAATC TTCTTGAGCA GACTTACCGT TCTAAGAAAC 14520 
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AACCAAATCT CTTCTTTGCT GGTCAAATGA CGGGTGTGGA AGGCTATGTT GAGTCGGCGG 14580 

CTTCAGGCTT AGTTGCGGGA ATTAACGCAG CTCGTCTCTT CAAGGAAGAA AGCGAGGCTA 14640 

TTTTCCCCGA GACGACAGCG ATTGGAAGCT TAGCTCATTA CATTACCCAT GCCGACAGCA 14700 

AACATTTCCA ACCAATGAAT GTCAATTTTG GGATCATCAA GGAGTTGGAA GGCGAGCGTA 14760 

TCCGTGATAA GAAGGCTCGT TATGAAAAAA TTGCAGACCG TGCCCTTGCC GACTTAGAGG 14820 

AATTTTTGAC TGTCTAATTT TTTTGAAAGA ATTGCTCATG ATACTATAAA AATCTTAGAA 14880

ATTGTGATAA AATAGGTAGG ATGAAAGAAG GAGAGTGAAA ATCGCGAATC CCAAGTATAA 14940 

ACGTATTTTA ATCAAGTTAT CAGGTGAAGC CCTTGCCGGT GAACGTGGCG TAGGGATTGA 15000 

TATCCAAACA GTTCAAACAA TCGCAAAAGA GATTCAAGAA GTTCATAGCT TAGGTATCGA 15060 

AATTGCCCTT GTTATCGGTG GAGGAAATCT CTGGCGTGGA GAACCTGCAG CAGAAGCAGG 15120 

TATGGACCGT GTTCAGGCAG ATTACACAGG AATGCTTGGG ACTGTTATGA ATGCTCTTGT 15180 

GATGGCAGAT TCATTGCAAC AAGTTGGGGT TGATACGCGT GTACAAACAG CTATTGCCAT 15240 

GCAACAAGTG GCAGAGCCTT ATGTCCGTGG ACGTGCCCTT CGTCACCTTG AAAAAGGCCG 15300 

TATCCTTATC TTTGGTGCTG GAATTGGTTC ACCTTACTTC TCCACAGATA CAACAGCGGC 15360 

CCTTCGTGCA GCTGAAATCG AAGCAGATGC CATCCTCATG GCTAAAAATG GTCTCGATGG 15420

TGTTTACAAT GCCGATCCTA AGAAAGATAA GACAGCTGTT AAGTTTGAAG AATTGACCCA 15480 

CCGTGACGTT ATCAATAAAG GTCTTCGTAT CATGGACTCA ACAGCTTCAA CCCTCTCAAT 15540 

GGACAACGAC ATTGACTTGG TTGTATTCAA CATGAACCAA CCAGGCAACA TCAAACGTGT 15600 

CGTATTTGGT GAAAATATCG GAACAACAGT TTCAAATAAT ATCGAAGAAA AGGAATAAGA 15660 

AAGAATATGG CTAACGCAAT TATTGAAAAA GCTAAAGAGA GAATGACCCA GTCTCACCAA 15720 

TCACTTCCTC GTGAATTTGG TGGTATCCGT GCTGGTCGTG CCAATGCAAG CTTGCTTGAC 15780 

CGTGTACATG TAGAATACTA TGGAGTCGAA ACTCCTCTTA ACCAAATCGC TTCAATTACG 15840 

ATTCCAGAAG CGCGTGTTTT GTTGGTAACA CCATTTGACA AGTCTTCATT GAAAGACATC 15900 

GAACGTGCCT TGAACGCTTC TGATATTGGT ATCACACCGG CTAATGACGG TTCTGTGATT 15960 

CGCTTGGTTA TCCCAGCTCT TACAGAAGAA ACTCGTCGTG ACCTTGCTAA AGAAGTGAAG 16020 

AAGGTCGGCG AAAATGCTAA AGTGGCTGTC CGCAATATCC GTCGCGATCC TATGGACGAA 16080

GCTAAGAAAC GAGAAAAAGC AAAAGAAATC ACTGAAGACG AATTGAAGAC TCTTGAAAAA 16140 

GACATTCAAA AAGTAACAGA CGATGCTGTT AAACACATCG ACGACATGAC TGCTAACAAA 16200

GAGAAAGAAC TTTTGGAAGT CTAAAAATAA ACAGAAAAAC TCAGTTGGCA TTGCTGGCTG 16260 

AGTTTTATTC GAAAGAAGGA AATATGAATA CAAATCTTCC AAGTTTTATC GTTGGACTGA 16320 
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TCATCGATGA AAACGACCGT TTTTACTTTG TGCAAAAGGA TGGTCAAACC TATGCTCTTG 16330 

CTAAGGAAGA AGGCCAACAT ACAGTAGGGG ATACGGTCAA AGGTTTTGCA TACACGGATA 16440 

TGAAGCAAAA ACTCCGCCTG ACAACCTTAG AAGTGACTGC CACTCAGGAC CAATTTGGTT 16500 

GCGGACGTGT CACAGAGGTT CGTAAGGACT TGGGTGTCTT TGTGGATACA GGCCTTCCTG 16560 

ACAAGGAAAT CGTTGTGTCA CTCGATATTC TCCCTGAGCT CAAGGAACTC TGGCCTAAGA 16620 

AGGGCGACCA ACTCTACATC CGTCTTGAAG TGGATAAGAA AGACCGTATC TGGGGCCTCT 16680 

TGGCTTATCA AGAAGACTTC CAACGTCTTG CTCGTCCTGC CTACAACAAC ATGCAGAACC 16740 

AAAACTGGCC AGCCATTGTT TACCGTCTCA AGCTGTCAGG AACTTTTGTT TACCTACCAG 16800 

AAAATAATAT GCTTGGTTTT ATTCATCCTA GCGAGCGTTA CGCAGAGCCA CGTTTGGGGC 16860

AAGTATTAGA TGCGCGCGTT ATTGGTTTCC GTGAACTGGA CCGCACTCTG AACCTCTCCC 16920 

TCAAACCACG CTCCTTTGAA ATGTTGGAAA ACGATGCTCA GATGATTTTG ACTTATTTGG 16980 

AAAGCAATGG CGGTTTCATG ACCTTAAATG ACAAGTCATC TCCAGACGAC ATCAAGGCAA 17040 

CCTTTGGCAT TTCTAAAGGT CAGTTCAAGA AAGCTTTAGG TGGTCTTATG AAGGCTGGTA 17100 

AAATCAAGCA GGACCAGTTT GGGACAGAGT TGATTTAGGG AGGCTTATGA GAAAATCATT 17160 

TTACACTTGG CTCATGACCG AGCGCAATCC TAAAAGTAAC AGTCCCAAAG CAATTTTGGC 17220 

AGACCTCGCT TTTGAAGAGT CAGCCTTTCC AAAACACACA GATGATTTTG ATGAGGTCAG 17280 

TCGCTTTTTG GAGGAGCATG CCAGTTTCTC TTTTAACCTA GGAGATTTTG ACAGCATTTG 17340

GCAGGAATAT CTAGAACACT AGCATTTATT CATTGGGTTT GGGCTAGTAA TTTCTCCATC 17400 

CCTCTGCTAT AATAAAAAGA AATAAAAGGA TTAGAGAGGT TCTTTATTTG AAGGAACATT 17460

CAATAGACAT TCAACTGAGT CATCCAGATG ACCTGTTTCA TCTTTTTGGT TCCAATGAAC 17520 

GCCATCTTCG TTTGATGGAA GAAGAGCTTG ATGTTGTGAT TCATGCTCGT ACGGAGATTG 17580 

TCCACCTTTT GGGAGAAGAG TCTGCCTGTG AGGAAGCCCG TCAAGTTATT CAGGCTTTGA 17640

TGGTCTTGGT AAATCGTGGG ATGACCGTTG GTACGCCAGA TGTAGTCACT GCGATTAGCA 17700 

TGGTCAAAAA TGATGAAATT GACAAGTTTG TCGCCCTTTA CGAAGAAGAA ATTATCAAGG 17760 

ATAATACTGG GAAACCTATC CGTGTCAAAA CCCTACGGCA AAAGCTTTAT GTGGACAGTG 17820 

TCAAACAGCA TGATGTGACC TTTGGAATTG GGCCAGCAGG TACAGGGAAG ACCTTCCTTG 17880 

CAGTGACCTT GGCAGTGACT GCCCTTAAAC GTGGGCAAGT CAAGCGAATT ATCCTAACTC 17940 

GTCCAGCGGT GGAAGCGGGA GAGAGTCTTG GATTTCTTCC GGGTGATCTT AAGGAGAAGG 18000 

TGGATCCTTA CCTTCGTCCT GTTTACGATG CCTTGTATCA AATTCTTGGG AAAGACCAAA 18060 
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ATGTTAAGCG AAATCCGCAA TCTCAGCATT TTTGCGAAAA GCAGGGCTTT AAATCAATTG 19920 

GATGCGAGGT TAAGCAAGAA CTCTATACGG TTGTTATCGC TGAACAGAGC CTAGAAGATT 19980 

AGAAATGGCA TCAAGTAAGA ACTATTTGGA ATTTGTTTTG GAACAATTAT CAGGATTAGA 20040 

TGATGTGACT TACCGTTCCA TGATGGGGGA GTATATTCTT TACTTCCGCG GCAAGATTAT 20100 

TGGCGGCATT TATGACGATC GCTTTTTAGT TAAACCCGTG CAAGCAGTCT TAGATAAGAT 20160 

TGACCAATCT TCTTTTGAGT TTCCATACAA AGGTGCCAAA GAAATGATTT GAGTGGAAGA 20220 

ACTTGATAAT AAGATGTTTC TATAAGACCT AATTTTAGCT ATGTATAACC AACTGCCAAC 20280 

GCCCAAACCT AAAAAGAAAA AGCAAGGGTG AACGAAGTAA AAAAGAAGTC TGCTAAGGCC 20340 

CTGTCTTTGC ACGGGTAAAA TTTTATATAT AAAAAGAAGC TGGGACTAAA GAGCTCAGCT 20400 

TCCTTTGGTT TATATAATTG TCATTACAAG ACGAAGTGGT TGGGCGAAAC TCTGTTGACT 20460 

TTATTCAATT TAGAGTTTCT TATGCACAAT TGACTCTGGA ACGAAAGTCT CCAGTTCCAA 20520 

AGTATACACT ACAATAAACC AACGATGTAA TAGCTGATGA CACAAAGCAC AGTGGGTAGG 20580

ACTTGCGAAG TCACCCTTTT CTTTTCAAAA TTTATACTAA ATCATTGATA TCAGTGTAGT 20640 

CACGATTAAG TCCTTGAGCA ACTGGTAGGT TAGTCAAGTA ACCTTGATAA GTAGTCACAC 20700 

CTTGACGCAA CCCTTCATCT TCAGAGATTG CTTGTGCGAA TCCTTTCCCA GCCAAAGCTT 20760 

CGATATAAGG AAGAGTGACA TTGGTTAGGG CGATGGTTGA AGTGCGAGCA ACCGCACCAG 20820 

GGATATTGGC AACGGCATAG TCGAGAACAC CGTGTTTTTC ATAGACGGCT TCATCGTGCG 20880 

TTGTCACACG GTCACCTCTT TCCATAACCC CACCTTGGTC AACAGCAACG TCAACGATAC 20940 

AGAGCCTGGA CGCATTTGTT TGACCATCTC ATCTGTCACC AATTCCCGTG CTTTTGCACC 21000 

AGGGATGAGA ATGCCTCCAA TCACCACATC AGCATCTCTC ACACTTGCTT CAATCTTGAA 21060 

TGAATTAGAC ATAAGAGTTT GAATTTGACT TCCAAAGACT TCTTCTAGAA CTGAGAGACG 21120 

CTTGGAACTA ATATCTAAAA TAGTCACTTG AGCACCAAGA CCAAGGCCGA TGCGGGCAGC 21180 

ATGTGTACCG ACGACACCAC CACCGATGAT AGTTACTTTT CCTTTTGGAA CACCTGGTAC 21240 

ACCACCAAGT AGAACACCAG AGCCACCAGC TTGCTTAGTA AGGAAGTGAC CTCCGATTTG 21300 

AACAGCCATA CGACCTGCAA CCTCACTCAT AGGAACGAGG AGCGGTAGTT GTCCTTGATT 21360 

GTCACGAACA GTTTCAGTTG TTTTTGCTGT TAACATAGCA TCTGCTAATT CTGGAGCAGC 21420 

GGCCATGTGC AAGTAGGTGA AGAGAAGAAG ATCGTCGCGC AAGTAACCGT ATTCAGAACT 21480 

TAAAGATTCT TTTACTTTCA CAACCAACTC TGCTGCCCAA GCTTCACCAG CAGTAGCGAC 21540 

AATCTCAGCT CCTTGCTTTT GATAGTCAGC ATCAGTAAAG CCAGAACCGA GACCAGCATT 21600 
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TGTTTCGATA AGGACACGAT GACCACCACT AACTAAGCTA TGAACACCTG CAGGTGTGAG 21660 

GGCGACACGG TTTTCGTTAT TTTTAATTTC TTTTGGGATT CCGATTAACA TTGAGATAAC 21720 

CTACCTTTCA ATTGACGGTC TTGTTTTGGT TGTCACATTC CAGTTCATAA ATCAAAAATG 21780 

TGACGGTTTC ATTGTATATG AAACCGCTTC AAAAATCAAG AAAAACTTGT CATCCAAATT 21840 

TTTTTATGCT AGACTAGTGA AAATCAAGCT CTAATGGAGG GAAAAGTATG GAATCAATAT 21900 

TTGTGAAATT TGCCCAGTAT CCGTCTATAG AAACGGAGCG TTTATTCCTC AGACCTGTAA 21960 

CTTTGGATGA TGCGGAAcAA TGTTTGACTA TGCCTCGGAC AAGGGTAATA CACGTTACAC 22020 

TTTTCCAACC AATCAAAGCT TGGAAGAAAC CAAGAATAAC ATTGCTCAGT TCTACTTGGC 22080 

TAATCCCTTG GGACGTTGGG GAATAGAACT AAAAAGCAAT GGTCAGTTTA TTGGAACCAT 22140 

TGACTTGCAC AAGATTGATT CTGTTCTTAA GAAGGCAGCT ATTGGCTACA TTATCAATAA 22200 

AAAGTATTGG AATCAAGGAT TAACGACAGA AGCCAATCGT GCTGTGATTG AGCTAGCTTT 22260 

TGAGAAGATA GGGATGAATA AGTTGACTGC CCTTCACGAT AAGGCTAATC CCGCGTCAGC 22320 

AAAGGTCATG GAGAAATCAG GCATGCGTTT TTCCCATGCA GAACCATATG CTTGTATGGA 22380 

CCAGCATGAA AAAGGCCGAA TCGTGACAAG AGTTCATTAT GTCTTGACCA AGGAAGACTA 22440 

TTTTGCAAAT AAATAAGCAG TTGAAAAGAA ATTTTTCGAC TGTTTTTTCT TCCTCTTACG 22500 

AATAATCTAA GAGAGGAGAA AATATGGAAG CAATTATCGA GAAAATCAAA GAGTATAAAA 22560 

TCATCGTCAT CTGTACTGGT CTGGGCTTCC TTGTAGGAGG ATTTTTCCTG CTAAAACCAC 22620 

CTCCACAAAC ACCTGTCAAA GAGACGAATT TGCAGGCTGA AGTTGCAGCT GTTTCCAAGG 22680 

ACTCATCGAC CGAAAAGGAA GTGAAGAAGG AAGAAAAGGA AGAACCCCTT GAACAAGATC 22740 

TAATCACAGT AGATGTCAAA GGTGCTGTCA AATCGCCAGG GATTTATGAC TTGCCTGTAG 22800 

GTAGTCGAGT CAATGATGCT GTTCAGAAGG CTGGTGGCTT GACAGAGCAA GCAGACAGCA 22860 

AGTCGCTCAA TCTAGCTCAG AAAGTTAGTG ATGAGGCTCT GGTTTACGTT CCTACTAAGC 22920 

GAGAAGAAGC AGTTAGTCAA CAGACTGCTT CGGGGACAGC TTCTTCAACA AGCAAGGAAA 22980 

AGAAGGTCAA TCTCAACAAG GCCAGTCTGG AAGAACTCAA GCAGGTCAAG GGACTGGGAG 23040 

GAAAACGAGC TCAGGACATT ATTGACCATC GTGAGGCAAA TGGCAAGTTC AAGTCAGTAG 23100 

ACGAGCTCAA GAAGGTCTCT GGCATTGGTG GCAAAACAAT AGAAAAGCTT AAAGACTATG 23160 

TTACAGTGGA TTAAGAATTT CTCTATTCCC CTAATTTACC TGAGTTTTCT ATTACTTTGG 23220 

CTTTATTACG CTATTTTCTC AGCATCTTAT CTTGCTTTGT TGGGCTTTGT TTTTCTGCTA 23280 

GTCTGTCTCT TTATCCAATT TCCGTGGAAA TCTGCTGGTA AAGTTCTAAT AATTTGCGGA 23340 

ATCTTTGGAT TTTGGTTTGT TTTTCAAAAT TGGCAACAGA GTCAAGCGAG TCAAAATCTG 23400 
GCGGATTCTG TTGAAAGGGT ACGGATTTTG CCTGATACTA TTAAGGTTAA TGGTGATAGT 23460 

CTATCCTTTC GTCGCAAGTC TAACGGTCGT GCTTTCCAAG TCTATTATAA ACTCCAGTCC 23520 

GAGGAGGAGA AAGAAGCCTT TCAAGCTTTA ACTGACCTGC ATGAGATAGG ACTAGAAGGG 23580 

AAGCTTTCGG AGCCAGAAGG GCAGAGAAAT TTTGGTGGCT TTAATTACCA AGCCTATCTG 23640 

AAGACTCAGG GAATTTACCA CACTCTCAAT ATCAAAACAA TCCAGTCACT TCAAAAGATT 23700 

GGCAGTTGGG ATATAGGAGA AAACTTGTCC AGTTTACGTC GAAAGGCTGT GGTTTGGATT 23760 

AAGACGCACT TTCCAGACCC TATGGGCAAT TACATGACAG GACTCTTGCT GGGACATCTG 23820 

GACACCGACT TTGAGGAGAT GAATGAGCTT TATTCCAGTC TAGGAATTAT CCACCTCTTT 23880 

GCCCTATCTG GCATGCAGGT AGGTTTTTTC ATGAATGGAT TTAAGAAACT TCTCTTGCCA 23940 

TTGGGCTTGA CCCAAGAAAA GTTGAAATGG CTGACTTATC CCTTTTCCCT TATCTATGCG 24000 

GGACTAACTG GATTTTCAGC ATCGGTTATT CGCAGTCTCT TGCAAAAGCT ACTGGCTCAA 24060 

CATGGGGTTA AGGGCTTGGA TAATTTTGCC TTGACGGTGC TTGTCCTCTT TATTGTCATG 24120 

CCAAACTTTT TCTTGACAGC AGGAGGAGTC TTGTCCTGCG CTTATGCTTT TATCCTGACC 24180 

ATGACCAGCA AAGAAGGGGA GGGGCTCAAG GCTGTTACTA GTGAAACTCT AGTCATCTCC 24240 

TTGGGCATAT TGCCCATTCT ATCCTTCTAT TTTGCGGAAT TTCAACCTTG GTCTATCCTT 24300 

TTGACCTTTG TCTTTTCCTT TCTTTTTGAC TTGGTCTTCT TACCGCTCTT GTCTATCTTA 24360 

TTTGTCCTTT CCTTTCTCTA TCCAGTCATT CAGCTGAAC? TTATCTTTGA ATGGTTAGAG 24420 

GGCATTATTC GCTTGGTCTC GCAGGTGGCA AGGAGACCAC TTGTCTTTGG TCAACCCAAC 24480 

GCATGGCTTT TAATCTTATT GTTAATTTCC TTGGCTTTGG TCTATGATTT GAGGAAAAAC 24540 

ATTAAAGGAT TAACAGTATT GAGTTTATTG ATTACACGTC TCTTTTTCCT TACCAAGTAT 24600 

CCACTGGAAA ATGAAATCAC CATGCTGGAT GTOGGGCAAG GAGAAAGTAT TTTCTACGGG 24660 

ATGTAACTGG GAAAACCATT CTCATAGATG TAGGTGGTAA GGCAGAATCT TATAAGAAAA 24720 

TCAAAAAATG GCAAGAAAAG ATGACGACCA GCAATGCCCA GCGAACCTTG ATTCCCTATC 24780 

TCAAAAGTCG AGGAGTAGCT AAGATTGACC AGCTAATTTT GACTAACACG GACAAGGAGC 24840 

ATGTTGGAGA TTTGTCAGAG ATGACCAAGG CTTTCCATGT AGGGGAGATT CTAGTATCAA 24900 

AAGACAGTCT GAAACAGAAG GAATTTGTGG CAGAACTACA GGCGACTCAA ACAAAGGTGC 24960 

GTAGTATGAT AGTAGGGGAG AACTTGCCCA TTTTTGGAAG TCAGTTAGAA GTTCTATCTC 25020 

CAAGGAAAAT GGGAGATGGA GGACACGATG ATACCCTAGT TCTGTATGGG AAATTCTTGG 25080 

ATAAGCAATT TCTCTTCACG GGAAATTTGG AGGAGAAAGG AGAGAAGGAC TTGCTGAAGC 25140 
TTATTTTTAT GCTAACACAA CTATTTGCAG AAAGATATCA AAATCATCTG GACACAGCTC 27000 

ACTTATATCC TGTTTCAAAA GTGACATTTG CAATATCCTC TCTTGGAGTT GGAGTGGGAT 27060 

ATGTAACTGT GCTGTTTATC GGAATCTGTG CCTTTT C TTT TCTAGTGGGA AGTCTGATAA 27120 

GTGGTTTTGG ACAGTTAGAT TATCCCTACC CAATTTATAG CTTAGTGAAT CAAGAAGTAA 27180 

CTATTGGGAA AATACAAGAT GTATTATTTC CTGGCTTGCT CTTAGCTTTC TTAGCCTTTA 27240 

TCGTCATTGT GGAAGTTGTG TACTTGATTG CTTACTTTTT CAAGCAAAAA ATGCCTGTCC 27300 

TCTTTCTTTC ACTCATTGGG ATTGTTGGCT TATTGTTTGG TATCCAAACC ATTCAGCCTC 27360 

TTCAAAGGAT TGCACATCTG ATTCCCTTTA CTTACTTGCG TTCAGTGGAG ATTTTATCTG 27420 

GAAGATTACC TAAGCAGATT GATAATGTCG ATCTAAATTG GAGCATGGGA ATGGTCTTAC 27480 

TTCCTTGCCT GATTATCTTT TTGCTATTGG GAATTCTATT TATTGAAAGA TGGGGAAGTT 27540 

CACAGAAAAA AGAATTTTTT AATAGATTCT AGCTTTCCTA TAGGTAGGGA AAATAAGTAA 27600 

AAACTAACAT AGAGAGGGAA TCAACTTGAT TCTCTCTTTT TGATTCGAAA ACCAAACCAA 27660 

AATACAAACA CAAACTTTTC AAAAAATAAC TTTTTATCTT GACAAGAGCT AGAAAACTTG 27720 

GTATCATATA AAAGTTGAGA AAAGCAGAAG TGAGAGCTTC TCGCCTTGTG ACATTAAGTT 27780 

GCCTGGCCCT ACGGATGAAA AGTTTCGAAG AAACGCTATC ATAACCTGCG GGCTTGTATA 27840 

TTTACAAGTC CGCTATTGTT TTTCTCTAAT AAAACAAAAG AGGTGAAAAC CATAGCAAAG 27900 

CAAGACTTAT TCATCAATGA TGAGATTCGT C7ACGTGAAG TTCGCTTGAT TGGTCTTGAA 27960 

GGAGAACAGC TAGGTATCAA GCCACTCAGT GAAGCGCAAG CTTTGGCTGA TAACCCTAAT 28020 

CTTGACCTAG TATTGATTCA ACCCCAAGCC AAACCCCCTG TTGCAAAAAT TATGGACTAC 28080 

GGTAACTTCA AATTTCAC? - *. CCAGAAGAAG :,VA\AG.UC AACSTAAAAA A-.V^-VXCTT 28140 

GTTACTGTGA AAGAAGTTCG TCTAAGTCCG G 28171 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7147 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CCGCTCAACT TTTGCAATCA AGGCTAAGTA GACAGCAGCA AATTTCATAT TGTATAATTT 60 

CTGACTCATA CTTCTCTCTT TCTATGTGTA CTAGTATAAA TAAGAAAAAG AAGGCCGTCA 120 
AGCCTTCTTT TGATTTATTC TTCTGCTTCA TCTTCTGTAA ATTGACTATT GTACAAGTCA 180 

GCGTAGAAGC CACCTTGCGC CATCAGTTCC TCATAGTTGC CTTCCTCGAT GATATTTCCA 240 

TCTTTCATGA CCAAGATCAA GTCTGCATTT CGGATGGTTG ACAAGCGGTG GGCAATGACA 300 

AAGGATGTGC GTCCTTCCAT CAAACGGTCC ATGGCTTTTT GGATCAATTC CTCTGTCCGT 360 

GTGTCAACAG AAGAAGTCGC CTCATCCAAA ATCAAAAGCG GTGCATCCTT AAGAAGGGCA 420 

CGAGCAATAG TCAATAGTTG TTTTTGTCTT ACAGACAAGG TCACGGTGTC ATCCAAGATG 480 

GTATCATAGC CATCTGGCAA GGTCATAATA AAGTGGTGAA TTCCCACAGC CTTACTAGCT 540 

TCCATCATTC GTTCATCACT AATCCCTATT TGATTATAGA TGAGATTGTC TCGAATAGTT 600 

CCTTCAAAGA GCCAGGTATC CTGCAAGACC ATTGAAAAGG CATCATGCAC TTCTGAACGC 660 

GTCATAGCCT TGGTATCCAC ACCATCAATG CGAATACTTC CCTTATCAAT CTCATAGAAT 720 

TTCATCAAAA GATTGACAAT GGTTGTCTTA CCAGCCCCAG TCGGCCCAAC AATGGCAACC 780 

TTTTGACCAG CATGAGCTGT CGCAGAGAAG TCATAGTCTT GAACATTGAC ACCGTCCACC 840 

AGAATTTCTC CTGCTGACAC GTCGTAGAAA CGTGGAATCA GATTGACCAG AGTTGATTTA 900 

CCAGAACCTG TTGACCCAAT AAAGGCCACT GTTTGACCAG TTTCTGCTTT AAAGCTAACA 960 

TCTTCAATAA CTGCCTCCGA ATTTGCCGCA TAGCGgAAGG TCACATCCTT AAACTCGACC 1020 

TGACCTTTGA AGTTTTCATC AGTCAGCTGC ACTTGAACAG GGTTTTGGAT AGAAGAATGC 1080 

AAATCTAAAA CTTGATTAAT CCGCTTAGCA GAGACCATAG TTCGGGGAAG AACGATGAAG 1140 

AGTGCTCCCA TGAGAAGGAA GCCCATGACA ACCTACATGG CATAACACAT GAAAACAATC 1200 

ATGTCACTAA AGAGAGGCAG ACGCGCTATC GGAGCAGCGT CGTTAATCAC ATAGGCCCCA 1260 

ATCCAGTAAA TCGCCACACT CAAACCACTT GAAATCCCCA TCATCATAGG ATTCAAAATA 1320 

GCCATAAGAC GGTTGACAAA CAAATTCAAA CGGCTCAATT CATCATTTAC TGCTGCAAAT 1380 

TTTTCATTTT GATAATCCTC TGCATTGTAG GCACGAACGA CACGAATACC TGTTAAACTC 1440 

TCACGAGTGA TACTGTTCAG TTTATCTGTC AGCCCCTGAA TCAAGGACTG TTTTGGAAAG 1500 

GCTAGCGTCA TCAAAACGGT CGTCATCAGG ACGTTGATAA TCACTGCCAC AAGTACGGCC 1560 

CAGAGCCAGT ATTCTGAATG ACCTAAAATC TTCCCAATAG CCCAGATAGC CATAATTGAA 1620 

CCACGCGTTA CCACTTGCAA GCCCATAGTA ATCAACATTT GAACTTGAGT AATGTCATTG 1680 

GTAGTACGCG TCAAGACGCT AGGAATTGAA AATTTCTTAA TCTCTGTCTG CGAGTAATCC 1740 

AAAACTCGGT TAAAAATATC ACTTCTCAGC CTACTAGTAT AAGAAGCCGC CACTCGGGAT 1800 

GCAAAAAATC CAACTGCAAC TACGGACAAG AAGGCAAGAA AGGACATTCC CATCATCATG 1860 

CTTGCCGACT GCCACAACTC ATCTAAATTA GTTTCTTGAC TACCTAGCAA ATCCGTAATT 1920 

TTCGAGATAT AGGTCGGCAC TTCCAACTCT AGATAGACCG AAAAGCAAGT AAAGAGAATG 1980 

GCTAGTAAAA TCATCCCCCA TTCTTTTCTA CTAATTCTTT TGGCTAATTT CTTTATTCTC 2040 

TCCTCCTATT CCCTTGATAT TTTGCCTGTA GTTGACCGAG AACCTTCTCA AAAATCAGTA 2100 

ATTCATCTTC ATCAATGTCT TCCATCAACT GCTTGTCTAT GCGTTCAAAA AAAGCCTTAA 2160 

CCTGTTGCAT CTGAGAACGT GCTTTGTCCG TCAGACGAAC AAACTTAGCC CGCTTATCAA 2220 

CAGGACTCGC CTCCAATTCC ACCAAACCAT TTTGCACTAT ACGCTTAACC AGATTACTAG 2280 

CAACAGGCTT GGTAATATTG AGTTCCTGCT CGATATCTTT AATCAAGACC AAGTCTTGGT 2340 

TTTTCTCGCG ATTATCCAAA AAACGCACAA CCTGACCTTG CGGCCCACCC ATAAATTCAA 2400 

TGCCGCAACG TTTGGCTTCC TTTTGCACCA TCAGGTGAAT TTGATGACCA AAACGCTTAA 2460 

AGACTAACAT CGGTTTATCC ATAATCTCCC CCTTCTAAAT AAAAATAGTT CTCTGGAGAA 2520 

TAATTAAATT TCTATGAGAA CTATTTTCTT GATTAAAAAA ATCCCAAGTG ATTTTCTCAC 2580 

TTACGATCAT GTTCTATAGG TTAAATTAAA ACCCATCTAC GTTCGTATAA ATCTTTTGGA 2640 

CGTCTTCGTC GTCTTCAAGA ACGCTGTAAA GTTTTTCAAA GGTTTCAAGG TCTTCGCCTG 2700 

ACAATTCCAC TTCTGACTGA GGAATCATTT CCAATTCAGT CACTTGGAAT TCTTCAATAC 2760 

CAGACTCACG CAGGGCAACG ATAGCCTTGT GAAGGTCAGT TGCCGCTGTG TAAACTGTGA 2820 

TTGTACCTTC TTGTGCTTCT ACGTCATCCA CATCCACATC CGCTTCGAGC AATTCCTCAA 2880 

AGACTGCGTC CGCATCTTCA CCTCCAAATA CAATAACACC TTTGTTGTCA AAGAGGTAAG 2940 

AAACAGAACC TGAAGCGCCC ATGTTTCCGC CGTTTTTACC AAAGGCTGCA CGGACATTGG 3000 

CTGCTGTACG GTTGACGTTA GAAGTCAAAG TATCCACAAT TAGCATAGAG CCATTTGGCC 3060 

CAAAACCTTC GTAACGTCCT TCTGTAAACG TTTCGTCTGT GTTTCCTTTG gctitatcaa 3120 

TCGCTTTATC GATAATGTGT TTTGGCACTT GGGCTTGTTT AGCACGGTCG ATAACGAATT 3180 

TCAAAGCTGA GTTTGATTCT GGATCTGGAT CACCTTTTTT AGCTGCTACA TAGATTTCTA 3240 

CACCAAATTT TGCATATACT TTAGAGTTAG CTCCATCTTT AGCCGTTTTC TTGGCTACGA 3300 

TATTGGCCCA TTTACGTCCC ATTAGGAATC TCCTTTTTTC ACATTTTAAT CTTTCTTATT 3360 

ATAACACAAG TTTTTTTGAT TTTCACTAGA GGAAATGGAT TTTATTAGCA AATCAAGCTA 3420 

GGATAGCACT TTACCTGCTA AGATGGTCTT GCCTTTCTAT CTTTATCAAC AGGCACTCAT 3480 

CCACATTCAA AAAACAAACT AGACCATTAT CTGCAAATAG AAAGTTTCAG CCAAGTTTGA 3540 

CAAAGTCAGC TCAAATTACT GTTTGAAGTT TGTAGATATA AGCGACAAAA ACAATCATAC 3600 

TGCACCTTTT GTTGACAGTC TACTCCAGAC ATATCATAGT TCAAGTAAAT ACTTTGAAAT 3660 

TCAACAGTTC TTATAGGCGC TATTGTATTC TAAGAAATCA ATAGAACAGT TTCTAAGCAA 3720 

ACCTCTAATA CTCAATAAAA ATCAAAGAGC AAACTAGAAA GCTAGCCTCA GGTTGCTCAA 3780 

AACACTGTTT TGAGGTTGCG GATGGGGCTG ACATGGTTTG AAGAGATTTT CGAAGAGTAT 3840 

AATTTACGTG TTCCCAAGAT GGAGAAGTTA GACTAGTACA CTGGCACTTC TAAAACATTG 3900 

CTAGCAATTG ATTTGTTCAT ATTTAATTTC ATTTTTTCCA TAAATGGGTA TTAGATATAA 3960 

ACAGCAAAAT ATTTCCGATA CGTGTCGTTC TTGAATTTCC AATCATCTAA AACAAGTAAA 4020 

GGATAATCAA TCCCCTGTAT ATCAAGGAAT TGGCTACCCT TTTTACTTTT TTACACATTC 4080 

TGTTTGATAG ATTCATTTTA ACATCACGAG CATACTCCAA TGGAAATCGC TAGGCAAGAG 4140 

ATAAACTTTC 


AGATATCCGC 


AGAGAGATCA 


TCGCCTCTTT 


TTGTCGCAAG 






TCCTAGTCAT 


TTTCTACCTT 


ATCTTCTACC 


TGAGGATAGA 


GACTTGTTCC 






ATCGTCCGCT 


TACGCACTAG 


TGGCAAATCG 


GTTTTTTCAT 


AAACCCTACG 




*i J Z u 


CAGGCAAGCC 


CGGTACACTC 


TCTAATTTTG 


ACAGAGAGAT 


TACGAACATT 


V. i. 1 4 A AAA 


4Jt)U 


GGAATACTAG 


TGGTAAAGTG 


AGCCGTTAAA 


TCCTGCCCAT 


TTCTCTCCCA 


AflPfT^AnnA 


4 1 4 U 


GTCAAGACTT 


CCTTACCTTG 


ATGATCATAG 


GATAATTCAT 


TCCAACTAAT 


ATAATATT/V: 
AlAnini i 




GCAACATAGG 


CACCACTATG 


ATCCAGCAGT 


AAATCTCCGT 


TTCTGTAAfiP 


I i /vr\v* V_ l l A 


^ ^ An 


GTCTCAACAT 


AGTCTGTACT 


ATTTTGAAAG 


GTCGCAACTA 


CATTGTCACG 


TAAAAAAfIA A 


4 o« u 


GTTGTATAGG 


AAATCGGCAA 


GCCTGGATGA 


TCTGCTGTAA 


AGCGACTGCC 




*i DOU 


AAGTCCTCTA 


CCATATCCAC 


CTTGCCTGTT 


ACAACTCGCG 


CACCCGAACT 


X UuU t \_ UVV. ^» 


AT Afl 


CCTAAAATAA 


CCGCCTTCAC 


TTCTGTATTG 


TCCAAAATCT 


GTTTCCACTC 


TGTCTGAHGA 




GCTACCTTGA 


CTCCTTTTAT 


CAAAGCTTCA 


AAAGCAGCCT 


CT ACT*T C A TC 


AfTfPT AOTr* 

A 4 4 f\\, 4 V* 


iOOU 


GTGGT7TCCA 


ACTTGAGATA 


GACTTGGCCC 


CCATAAGCAA 


CACTCGAAAT 


ATAGAf*f*AAA 




GGACGCTCTG CAGAAATTCC TCTCTGTTTT AAATCCTCTA CCGTTACAGT ATCTTGAAAC 4980 

ACATCTCCTG GATTTTTAAC AGCATCTACG CTGACTGTAT AATAAATCTG CTTAAAATTA 5040 

ACAATCTGAA TCTGCTTTTC GCCTGAATGG ACAGAGTTAA AATCAATATC AACAGAATTC 5100 

CCTGTCTTTT CAAAGTCAGA ACCAAACTTG ACCTTGAGTT GTTCCATGCT CTGAGCCGTC 5160 

ATTTTTTCAT ACTGCATTCT AGCTGGGACA TTATTGACCT GACCATAATC TTGATGCCAC 5220 

TTAGCCAACA AATCGTTTAC CGCTCCGCGA ACACTTGAAT TGCTGGGGTC TTCCACTTGC 5280 

AGAAAGCTAT CGCTACTTGC CAAACCAGGC AAATCAATAC TATAAGTCAT CGGAGCACGA 5340 

TCGACCGCAA GAAGAGTGGG ATTATTCTCT AACAAGGTCT CATCCACTAC GAGAAGTGCT 5400 

CCAGGATAGA GGCGACTGTC GTTGGTAGCT GTTACAGAAA TATCACTTGT ATTTGTCGAC 5460 

AAGCTCCGCT TCTTTCTTTC GATAACAACA AACTCATCGG GTAGCTGATT ACCCTCTTTG 5520 

ATGAAACGAT TTTCAATACT TTCTCCCTGA TGGGTCAAGA GTTTCTTTTT ATCGTAATTC 5580 

ATAGCTAGTA TAAAGTCATT TACTGCTTTA TTTGCCATCT TCTACCTCCT AATAAGTTCC 5640 

TGGATTGAGT TGCATAAACT CAGACTTGTT CAGCGAAATC AGCCGTGGTT GGACTAAGTA 5700 

ATCCAAAATT TCCTCGTACA ATTCTTCTGA GACATTGCGT CGCCGTCTGG CTAAATAAGA 5760 

AGTCGGAATG ACCGTATTAT CCAACATAAA TACCTTATCT AAGTCAATCA AGGTTGGTCT 5820 

TGTAAAAGGA TTACGAGCTA GATCCGGCTC TTCTATCATA AAGTTCTTGA CCAAACGTCT 5880 

GGTCAAGAGA GCTGGTTTGA AGGTCTGATT TTTAACCAAC TCTTTGTTTT TAGTCATGCT 5940 

GTTGTCAATA CAGATATACA TATGATTCTT CACAGCCAAA TCGCTACTAA TAGTCGGAAA 6000 

AGGCAAATAA AGAGCTACAA CATCTCCTCT CTTAATCAAG CAAGAGCACC CCCTTTTCTC 6060 

CTAATGTAAC ATAGACAGGA TTGACCAAGT CTTCTGATTG ACTCAGAATT TCCAAAGTTT 6120 

GAGTTTGGCG CGCTGTCAAT TTAGTAGCAT CTTGTCTCTT CAATACAAAA TGCTTGTCGC 6180 

CAATAACCTT GACAATATAA TCCTTCTCCA AAGCTGACTG GTAAATCCAC ATCAGATGTT 6240 

GTCTGTCCTG AGAACTCAAG AGAGAAGGAT TTTCAAGCCT CCCGATAGTC TGATAAAAAT 6300 

CAAAAACAGG AGCTAACTCC TGCCAATCTG ATTGGCTAGT TCTCAAGGCT AGAAAAAGGG 6360 

CTTTGCGAGC TGATACTTCT TGGTTAGCCT TGAGAGTTAC TTTCCCCTCC AAGTTTTTTA 6420 

GAAATCGGGA AACTCCACAA AGCAAATTTT TCTCTAACTG CGAGAAATAA AAACCTTTCG 6480 

TTCCCAGACA TAAGTCTTTC ATGTCGCTTT CTCTAGCAAA TAAGAGCTCA AACATTTGAT 6540 

AGTAAAAGAA AAATATCTGG CACTGGGTCG CGCTCATCTT TTCCTTATCG GCTTCTTTTT 6600 

TTAACCAGAG CAAGGGCGAC AGGTAGCTGG ATTGAGACAT TTCCTCTACC TCCTACTCTT 6660 

TTTTAACTGG AGCATCTGCA CTAGCTGCCA CTTCTTTTGA CTGGATACTT TCCCACTGGT 6720 

TAATCTCCTC TGAGATAAGA CCTTCGCATG TCTTGACAAA TAGGGCAAAA GCCTTGGTCT 6780 

TTCCTGCATA TTTCTCCGTT TGGCATTGAT AGAGGAATTT TTCTTTCTCC AGGAGTTGCG 6840 

CAGTTTTTTG GTAAGAAATC CAATTTTCCT TTGCATTATA CAAATTGATA ATCCCCTCAC 6900 

ACAGCAAGCC GAGACTGGAT AAGGCAACCG AAATCAAACG GTAGCGATCA CCTGGCATAG 6960 

GAATAGCACA AAAGACAGCT ATGAGGAAAC CTGCCACGAT TTCTGTTATT TTTAATACCT 7020 

TATAGCGCCT ACGATGTTGA ACGCTTTTCT TTAAAAAATG AGCTATCTGT ACGTCTAATC 7080 

GCTCTGTCAG GTACATTTCT TCTGGCGTCA TATTCGTAAC TCCTTTCATT TACTTTGATA 7140 

ATCAGGG 7147 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 755 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

CCGCATGGGA TTGGTGTCCT TTTGGGCAAT CTCTTTGACC AAACTGGAAA CATGTTTTAT 60 

GCGCCTGCCT TTACTGCCCT TGTCGGCGGT ACGTCTATAT GATCCTAGTC GCAAAAGTTC 120 

CGCGCTTTGG AGCCATTACC ACTATCGGCC TTGTCATTGC CCTCTTTTTC TTGGGAACTA 180 

AACACGGTGC TGGTTCCTTC CTTCCTGGAA TTATCTGTGG CCTCCTAGCA GATGGAGTAG 240 

CTCATTTAGG AAAATACAAG GACAAAACAA AGAACTTCCT TTCTTTCATT ATTTTCGCCT 300 

TTAGTACAAC AGGACCAATC TTGCTTATGT GGATTGCGCC CAAAGCCTAT ATGGCTACTC 360 

TTCTGGCAAG AGGAAAATCC CAAGAATATA TCCACCGTAT CATGGTCGCT CCAAACCCTG 420 

GAACTGTCCT TCTATTTATC GCAAGTATTG TCATCGGAGC CCTAGTGGGT GCCTTGATTG 480 

GACAAGCCTT GAGTAAAAAA TTTGCCCAGA AAATCTGATC AGTTAAAAAG AGCCACGCGG 540 

CTCTTTTTTA TTTATGGCTC AATTTCTTAG TCAAGAAATC TCCCAAGAAT TGGATTGCAA 600 

AGATAATCAA AATGATAATA ATGGTTGCCA AGATGGTCAC ATCGTGATTG TAGCGGTTAA 660 

ATCCATAAGC GATGGCTACG TTACCGATAC CACCAGCTCC AACCGCACCG GCCATAGCTG 720 

TTtcCCAACA AGGGaAtCAA GGTcACAGTC GTCAC 755 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

TTCAATTGGT ATCTCAATCA ACGGTCTTCA CATGGTTTCA ACTGGTTTGA CTCTTGAAAA 60 

AGCGAAAGCT GCTGGTTACA ACGCAACTGA AACAGGCTTT AACGATCTTC AAAAACCAGA 120 

ATTCATGAAA CATGACAACC ATGAAGTAGC AATTAAGATT GTCTTTGACA AAGATAGCCG 180 

TGAAATTCTT GGTGCCCAAA TGGTTTCACA TGATATTGCA ATTAGCATGG GAATCCACAT 240 

GTTCTCACTT GCTATCCAAG AGCATGTGAC AATTGATAAA TTGGCATTGA CAGACCTCTT 300 
CTTCTTGCCA CACTTCAACA AACCATACAA CTACATCACA ATGGCTGCCC TTACGCCTGA 360 
AAATTAAAAA TGAATGAGCT ATCTGGCCTT AAGTTAAGGT CAGATAGTTT TTAGCTAATT 420 
TGTCCCCATA CAATTATAGT TTTTTTATCT TGTGCTTCAT TCTGTTCTGA CTTAAAATGA 480 
AAAGGTAGCT ACCAATACAA ATGATGAGGA TAAAACAAAT GACTGAAAAT CGTTATGAAC 540 
TAAATAAAAA CTTGGCACAG ATGCTCAAGG GTGGTGTTAT TATGGATGTG CAGAATCCTG 600 
AACAGGCTCG TATCGCAGAA GCTGCTGGTG CGGCAGCTGT GATGGCCTTG GAACGAATTC 660 
CGGCTGATAT TCGTGCAGCT GGAGGAGTTT CCCGCATGAG CGACCCAAAG ATGATTAAGG 720 
AAATCCAAGA AGCGGTTAGT ATTCCAGTAA TGGCTAAGGT CAGAATCGGG CATTTTGTTG 780 

AAGCTCAGAT TTTAGAGGCT ATTGAAATTG ATTATATCGA CGAGAGTGAA GTTCTATCTC 840 
CAGCTGATGA CCGTTTCCAT GTGGACAAGA AAGAATTCCA AGTTCCTTTT GTCTGTGGTG 900 
CTAAGCATTT CGGTGAAGCC TTGCGTCGTA TCGCTGAAGG TGCTTCCATG ATTCGTACCA 960 

AAGGAGAACC AGGGACAGGG GATATCGTCC AAGCTGTTCG TCATATCCGT ATGATGAATC 1020 

AGGAAATTCG CCGCATTCAA AACTTACGTG AGGACGAGCT TTATGTTGCT GCCAAGGATT 1080 

TGCAAGTCCC TGTAGAATTG GTCCAATATG TTCATGAACA TGGAAAATTG CCAGTTGTAA 1140 

ATTTCGCTGC TGGAGGTGTT GCAACGCCAG CAGATGCTGC GTTAATGATG CAATTAGGGG 1200 

CAGAGGGGGT CTTTGTCGGT TCAGGTATTT TCAAGTCAGG AGATCCTCTT AAACGAGCGA 1260 

GTGCCATTGT TAAGGCTGTG ACTAACTTCC GTAATCCTCA AATCCTAGCT CAAATCTCTG 1320 

AAGATTTAGG AGAAGCCATG GTTGGTATTA ATGAAAATGA AATCCAAATT CTCATGGCTG 1380 

AACGAGGAAA ATAGATGAAA ATCGGAATAT TGGCCTTGCA AGGGGCCTTT GCAGAACATG 1440 

CAAAAGTGCT AGATCAATTA GGTGTCGAGA GTGTAGAACT CAGAAATCTA GATGATTTTC 1500 

AGCAAGATCA GAGTGACTTG TCGGGTTTGA TTTTGCCTGG TGGTCAGTCT ACAACCATGG 1560 

GCAAGCTCTT ACGTGACCAG AACATGCTAC TTCCCATCCG AGAAGCCATT CTATCTGGCT 1620 

TACCAGTGTT TGGGACCTGT GCGGGCTTAA TTTTGCTGGC TAAGGAAATC ACTTCTCAGA 1680 

AAGAGAGTCA TCTAGGAACT ATGGATATGG TGGTCGAGCG TAATGCTTAT GGGCGCCAAT 1740 

TAGGAAGTTT CTACACGGAA GCAGAATGTA AGGGAGTTGG CAAGATTCCA ATGACCTTTA 1800 

TCCGTGGTCC GATTATCAGT AGTGTTGGTG AGGGTGTAGA AATTTTAGCA ACAGTGAACA 1860 

ATCAAATTGT TGCAGCCCAA GAAAAAAATA TGTTGGTAAG TTCTTTTCAT CCAGAATTGA 1920 

CTGATGATGT GCGCTTGCAC CACTACTTTA TCAATATGTG TAAAGAAAAA AGTTGAGATT 1980 

GAATTTCTCA ACTTTTTTAC ATGTAATAAA CAATAGCGAT GTATTGAAGT GCGGACGCAG 2040 

CTAGGATAAA GAGATGCCAA ATCATGTGGA AATAAGGTTT TTTCTTGGCA 7AAAATCCAG 2100 




ATAACAGAGT 


CCGCCAGTTA 


CCATGAGACT 


CCAGAAAACG 


GGTGTCGTTT 


2160 


GACTGATAAT 


GGCAGCAATG 


ATAGCCAGAA 


CCAACCAGCC 


CATAATCAGG 


TAAAGAGCAA 


2220 




CPCATTGACC 


TTTTTAGCAA 


AGATTTTATA 


GAGAATACCA 


AAGATGGTCG 


2280 




GATCACAATA 


ATCAGATAGC 


CAAACCAGTT 


ATTCATCAAG 


GTCAAGACAA 


2340 






ATGGCAACGT 


AAATCATACA 


ATGGTCAATG 


ATTCGCAAAA 


2400 


P AT ATTTftTft 




TAGGCCATAG 


AGTGATAAAT 


GGTGGATGAT 


AGGAACATGA 


2460 






ATGGAAACGC 


CGATAGAGGA 


TAAAAATCCG 


TGTGCTTCAT 


2520 


A APT AT AH AT 


flfiATnAAATA 


GGCAGCAAGA 


TAAGCATGAT 


GACTGCACCC 


ACAGCATGGG 


2580 


TCACGCTATT AGCAATCTCC TCTCCAAAAC TGAGTTGTTT GCTGAGTTTA AGACTAGTGT 2640 

TCATTCGATT ACCTCCTCTT GAGTATGATC GATTAAGTCT AGAGTTTGAT GATAGAGTTT 2700 

AACGGTTTGG CAGCTGGTTT GGATAATAGG GTTAGCTGGG TCAATTCCTT GGTTCATGTA 2760 

GTCCACAAAA GCATCGTAGA GTTGGTCTGA ACTTGCTTCA GTTTGTAGAG TATTAAGTGT 2820 

CTGGGCTATT TCTTGAATAG AAAATACAGA CTTGAGGGTT GTGATAGCAA TCAAACGCGC 2880 

AATCTGTTGG CGTTGGTATT TTTTTTTGTC AGGCTTTGTC AGGTAACCAT TTTTCACATA 2940 

ATTGTTGACC ATAGATCCTG TTAGGCCCTT GTCTTTATTA GGAGAGATAG CGGCGCAGAC 3000 

CTGATTGACA 3010 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15213 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CATAAATCGG TGCAAATAAC TTAATAGTGA AGTAGCCATT TCTTTCGTAT TTACCTGAGG 60 

CATATTCCCT AGACGAAAGA ATATTATTAT CAATCAAATC ATTGAATGAA CGTAGTCTTT 120 

CAACTTCTTC TACTGTTAGA TTTCTGACAA CATTTGTTGC ATAGACCTTA TTTCCATCAG 180 

GATCAGGATG GTACTCATTT GTAACTTTTC TAAGAAGTTG TTGTTTTTGA TTCGTATCCA 240 

ATTTAAGAAT TGAATTTCCT TCGAGATATT CCAACATATA AACAACGTCA AACATGTTGT 300 

GGACATATTG CTTCAAATCA TCTGCATTAT TAAATCTTGT AGTTGGATCA AGTACTTGTA 360 

ATCGTCGACT TTCTGTACTA TCAGATTTTG AATGTTTCAA GATGGAGTTG ATGGTAATGG 420 

TCGCATCATC TGGATGGTCT GGTGCTTGTA ATAATCCTTT AGCAAAGAAC TCTGGTCCCA 480 

AGCCACTTCT TCGACCATAT CCTCCAAGAT AAATGTCCTG ATCTGAGTCA TGTGTCATCT 540 

CATGCGTATA AGTAATAGCT CCATCCTTAT CCAACATTCG ATAACCCATA TAATAAACTG 600 

CATCACCTGT AGCATAAGCA CCGTGTTGAT TATGCCCAAC TTTATTTCCA ACAGGTCCAA 660 

AGAAATGTTG CATTGCAGGA TTTGGATTAT CAAAATCTCC CACTTCTGTA GCTTTCCCTA 720 

CGGTATTATC ATCGCCAAAT TTATAAGCAT CGTAAAGCAA AATATTTCTA TAAAGTTTTT 780 

CACGTGCATT GTCGTCTAAA ATACGATACC AATAATCGTA GTGATCTCGC TGACGTTTGG 840 

CTGTTTCACG CGCATTTTCT TCAACAAAAT CATTGAGAGC CTTGCCCGCT TTATGGTCAC 900 

TACTGCGGTA GCGATCATAA GCTCCAAATC CTAGACTAGA CATGGTCGAG ATGACAAATA 960 

CGGATCTCTC TGGCAAGGTC AGGAGAGGCA AGACCATATT GCGGTATTTC CATGTGGCAC 1020 

TCGTGATACG ATCATAAACA CCGATAGAAT ACTTGGTGCC AGCTAACCCT TGCTTCGTTT 1080 

TCACCTCTTC GATAGTGGAT TTTTCTTCGA CAATGTAAGC CTTAGTCTCT GATTTAAACC 1140 

AGTCATTATT GCTTCTATTT GGTAAAAAGA CTTTTCGGTA ATGTTCCAGC GTGCTAAACA 1200 

AATCTGTCGT TCCATGTTGA CTGGCAAGAC TGATACCATA AGTATCGACA TTATTCTTAG 1260 

CTAGAAGATT GTTAAAGCCA GATTTACCCA ACTCAATCAG AGTATCTAAT GGTGAAGCAT 1320 

TCCCCTTACC AAAGAAGTCC AAATGGTACA GAACTAGGTC TTTGACATTC ACCTGACCAT 1380 

AGCTAAAGTT ATACCACCGT TCCAGATAGG TCAAGCCAAG TAGCAAGGCT TCCTTGTTGC 1440 

GT7TGATTTT ATCTACAAGA TAACCTTCAG TGACGGGGTT AGCACTAGCC AGTCCAGCAT 1500 

CCGCTGACAA GAGTTTTTTC AAACTGTCTT CCAGTTGTTG TTTTGTTTTG GCGAACTGGT 1560 

CTTCTAGATA GAGCTCAGTT TGCTTGACGT TTGGACAAAT ACCCAGCGTC TTTCTGATGG 1620 

CTTCTGAATG ATAGTCAACC TTTTGTAAGT CAGGTAAGAC TTGCTTGATG ATAGAGGTTT 1680 

GGTCATACAG GAATTGCTTT GGCGTATAGA GAAGTCCAGT ATTGCCCAGA CTATATTCTG 1740 

CTAATTTGGC GAAATCATTC TGGTATTTGA GATCCAGCTT CTCAGATAAA TCATCCTTGT 1800 

AGTGAAGCAA GAGTTTGTTT GCAGTCTGTT TGTTAGAAAC AATGTCTGTG ATGACTTGGT 1860 

TGTCCTTCAT CATGACTGCT GACAAGAGTT CTTTTTGATA TAAAAGACTG TTCTCATTGA 1920 

ccagctttcc GTATTTGACG ATGGTTGCCT TGTTGTAGAA AGGTAGCAAT TTTTCAATGT 1980 

TTTTATAAGT CAAGTTGCGC TTAGCTTGAT AATAGGCCAC CTTAGAAAAA TCACTGTCTT 2040 

TTTTGCCACT TGTTGAAAGT GGCTCCACTG TTGGTAAAAT GAGAGGATTG ATTTCTGCTT 2100 

TTTTGCTTGC AATTTGAGAA GCATCTAGCA TTGTTCCTCT TTCTTCAAAG GATTCCTTGC 2160 

TGACGACCTC ATCCTTGACC AAGGTGACAT TGTAGACTCT GTTGGCCTTG CTGCTGAATG 2220 

TGTCCTTTAC CTTCATTTCG TTATAGTGGT AACCAGTGAT GGCATTTCCG TTGGTTACAT 2280 

TAACATCGCT GAGAACATTG GTCAAACTTC CAGCATGCCT AACATCACCA GAAGTTCGAT 2340 

CCCACAAATT GCCTGCCACT CCAGCGACTC TACCAAAGTG CTTGACATTG TTGATATCAC 2400 

CTTCAGCATA GCTATCTTGG ATCTGTGCAT CTCGGTCTAC TAGGCCTGCA AGTCCACCCA 2460 

CAGTCTGATC TGAAGTATTT GTGTTAGATG AAATGGCTAC TGTCGCTTTT GACTTAGTAA 2520 

GTAAAGCCTT GTCACCTGTC AAATGACCGA CCATACCACC GATATTGTAG GCAGCAGTCG 2580 

TTTCATAAGT GTTGATAATT CTTCCCTTGA AACTGCTCTC TGTGATGCTT GATTGCTCAG 2640 

CCTTAGCCAG CAAACCACCG ATACCACGTT CACCAGCCAG AACACCATCG ACGTGAACTT 2700 

GCTTAATTTT TGTGTTATTC TGAGCTTCAT TTGCCAGTGA ACCGATATCA TCTTTCCCTG 2760 

AAATAGCAAC ATTTTTTAGA CTCAGTTTTT CTACTGTAGC ACCACTCAAG TTTTCAAACA 2820 

GAGGTTTTTT CAAATTATAG ATAGCATAAT TCTTGCCATC TTTTTCACCG ATTAAACGAC 2880 

CAGTAAAGGT GTCCTTGATA TAGGATCTTT CATCAGGACC AAGCTCCACT TCGTTAGCAT 2940 

TCAGGCTGGC CGCTAAATGA TAGGTTCCAG AGGGATTTTG GTTTATAGCT TTGACCAGAT 3000 

TACTAAAGGA AGTAAAGTTT GTTGTTTCTT CTGTTCCCTT CTTAGCTAGA TAGAAGGTAA 3060 

AATTATCTTT ATATCTGCTT TCTATCTCCT GCTGAAGCTT CTCTACTTTT GCTGTGATTT 3120 

TATAAAGGAT TTTATCATTT TTTCTTTCCT CTGATATTGA TGCTACTGGT AGGTATACAT 3180 

CTTTGAATGA AGAAGATTTC ACTTTAACAA AGTAGCTATT TGGATTGCTT GGAACTTGCT 3240 

CTAACGAAAT GTGTTGTTTA TAAGTACCAT TTGACAAACT GTATAACTCT AGGTCGGAAA 3300 

CATTTCTTAA TTCAAGTGTT TTCTCTGGTT CTTCTACCTT TTTATCAGGG TCTAGTTCAT 3360 

TTTCTTGTTT AATTTCTTCG TTTCCATTTG AATTGCATGT GTTTGATTCG GTTGAAACAT 3420 

CCTCAGTTGA ATTTCCGTTT GATGGTTCTG GTTCTGTTTG TCCATTCTCT GATGTTGTAT 3480 

TACCTGAATT TTCTGGTTTT GTTGCAGTTC CGTTTTTTTC TGGTTCATTT GATTCTTCAA 3540 

CTGGTGGTTT TGAATCACTA GGTTTATTGG ATACTTCTCC AGTATTTTCG TTAGCTATTT 3600 

TCCCAGAGTT TGTTTGTGTT TCTTCTGCAG GTTGAACTGG TTTTTCTGTT TCTTGATTTG 3660 

AGGTACCTTC TACTGTGCCT TCATTTGGAT TTACTGGAAC TTCTTCTACA GTTTTTTCTG 3720 

AATTTTCATT TTTAGAGTCA TTATGTTCTG GTTTATTTGA TTCTCCAACT GAGGTTGTCG 3780 

AATCACTAGG ATTACTGGAC ACTTCCCCAG TATTTTTGCT AGATGTATCT GGTGATACTT 3840 

TCTCTGAATT CGTTGTTGAT TCTTCTGCAG GTTGAACTGG ATTTTCTGCT TCTTGAATTG 3900 

AGGTTCCTTC TGTAGTACCT TCATTTGGAT TTACTGGTGT TTCTTCTGTT GGTTTTACTG 3960 

GAACTTCTTC AGTTTTTTCT GGACCTTGTT CTTTGGTCTT CTCAACCGGA GTTTCAGGTT 4020 

TTACTTGCTC AATATTACCC TTATATTCTG GAAGCGGTGC TACCTGCTCT GGTTCACCTT 4080 

TATCACTTAC CACAGTATCT GGCGACTCTG GTTGAACCTC AGTCTCACCT TTGTCGGTCA 4140 

CAACTGCTTC GGGTAATGTA GGTTGAACTT CTGGTTCGCC TTTGTCACTT ACTACAGCTT 4200 

CGGGCAACTC AGGCTGAATT GCGGGTTCAA CAATAGCTCC AGACTGTACG TCCTTATGTT 4260 

CTACACCAGT CTCAGGTTGT TCCTTTATAA CTTGAGTTTT TTTAGTACCT TTTTCGACTA 4320 

TTCTTGGACT AGGCGCAGTC GTTGAAGTTG AAACAATTTC TCGCGAAACT TCTTCCTTGT 4380 

TTACAGAGAA TATTCTGACG ATTTCAACTT TCTTACCTAA TTTACCTTCT TGTTTTACTC 4440 

TTACAGTTCC TTCAGCTAAA TCAGGATTTT CTTGAATTTC TTCTTGAAAA TCTATTTTTG 4500 

TCTCCATAGT TTCCTCACGA TATAAGAGTT CAGGTTTGTT CAATTGACCT GATAAAACTT 4560 

CATCCTGTGG ATTTAATGTA TTTACCCCAG TCTTTTCTTT TGGAGAAATC TTCTCCTCTT 4620 

TCTTCGTTTC TAGATTCTTA TGTTCGGCTA ATTGTTCTTG AGAATCTGAA GATTGTTTCT 4680 

CTTCTTTTCT TGGATTGATT AATTCAGTAG AGAAAGGTTT TTCAACTACT TGAACTTCTG 4740 

TCGGCTTAGT TGAAGAAACA GGTGTTTGTT CCTGAATAGC TTGTACTGTT GATGGATGGT 4800 

CTACAAAATT CGGTGTAACA TTATAATCCA CCTTTTGTTG TTTTGTAGGA GTGGCAACTG 4860 

AACTCTTTTG ATTACTTACT TCAGACTCAG AAGTCGTTTT TCCCTCTTTG ATATATCCAA 4920 

TATAAGTGTA ACCTGAAATC TCTTTAGGAA GAGGTAATTT TTCTCCAGAG GTCAATTCAT 4980 

AGTCCGTATT GTAATTTAGC AAAAGATGAT TTTCTAAAGC ATGGACTGAA ACTAAGACAC 5040 

CATTTCCTAT CCCTGCAACC AATACTAAAT GTAATACCGT TTTATTCTTA ACCTTTTTCT 5100 

TGGAAACAGC AAAAATTAAA ATTCCCATAG CAGCTAAGCT AGCACCAGCA ACTAGGGCTT 5160 

GCCTCTCATT CTTGCTTCCA GTATTTGGCA ATTCCGCCAG TTGATTTTGA GAATTTAACT 5220 

TATAAACAAG ATAATAAGTT TCATCATCAT TCTCCACGTA TGTCGGAATA TCATAGACAA 5280 

GCTGCTTCTT TTCTTCTGAT GATAGCTCTG AATCTGCCAC ATATTTATAG TGAACTCCCG 5340 

CAGTTTCTTG AGCATCCACA GATGAACTAG CTAATACAGA CATAAAAAAT AAACTTGAAA 5400 

TCGTTGCAGA TACAAGTCCT ACTGATAATT TTCTAAATGA AAAACGCTCT TGTTTTTCAC 5460 

CAAAATACTT TTCCATTATT CCTCCTTGAA ATAAAATTTA TATATGTTAC AAAGACCTTT 5520 

ATTATATTAG TGTATTATCT ATTATCTATA GAAAAGGCAG TATACCTTAA TTATACTCTT 5580 

AATTTACAAA AAAGTCTTAA AATTGAGATG CGCTTTCATA CTTT G TTTTA TATTATTTGG 5640 

AGGTACAATA ACACCTACCA TGAAATTTAC ACGGTAGGTG TTACTCATAT CACTAATCGT 5700 



TCTAAAAATG GTTTGAGGCA GTTGAGGAGA ATTCCTTCTA TCCAGCTTCC TTGTGCTGAT 5760 

GAGCGATGGT CTTCCTGCAG GCTTTTTTTT AGAAAATCTC GGACTTGTTC TGGTGCGATT 5820 

TCAAATTCAA AGGCTTTCAT TTTATAGAAA AAGTCGATGA GATGATCTGA CAGGTATTCA 5880 

GTTGAAAAGG GTACTTCACC ACTTTTTCTA TATTCTAATA AGAGTCTAGA AAATCGAGCT 5940 

TTTTCTTCAG GAAGCTCACG AAAATAGGAA TTGAGGATCC AAGTCTGCTT CTGTTTTCTT 6000 

TCAATTGGAT CCTGACTGGC AATTCGTTGG TCTTTTTCCA GCTCTTTTTG GTATTGTTTG 6060 

GCCTTGATAG CTCGTTCTGC TCTATTTTTA CCAAAAAGAA TTTTTTCCCA CTTGCGTTCT 6120 

TCTTGAGTCA GGGTCTCTGT AAAGCCAAAG TAATCTTGAT AAGCACGCTC TGCGGGTCCC 6180 

ATRGCTAGAA CCACATTGTC TGCATATTGC TTGGCGATTT TATCCCTCTT CTTGCGTTCT 6240 




GGATACGGAG 


T , T*C f T w TC5T l I*CG 

X X W X X X7 X X 


TAGTCAATTT 


TCTCCTTGCC 


TAGCTTGACA 


6300 


AGGTAGAGTT 


GGTCATCCGA 


TTTCCCAAGT 


AAAAAGGGTT 


TGATACACTT 


TTCAAGGACT 


6360 




VJ/\0\«V* 11111 


W XXX XV\bF X X %#^b> 


GCCTTGGTCC 


AACTTCCTCC 


CTGAAAGACT 

«af A 1|M "J* ■j^a**« ■ #> 


6420 


TfT l^n A A AA 


V* V» 1 ww 1 /%W 1 


TCTCTCAGGC 


GCAAATTGAT 


TGCCACGATT 


GGGTTTGAAA 


6480 


/\v.r\v_V 11111 


rrVAftAnffA 

v_*l^l^ nviivu 


TTT'T AK AAGT 


CGCTCGTCAA 


AGTTACTTTT 


ATTGACCTTG 


6540 


nl 1 I 1 1 i LV. 1 


ii x iv>i unw> 


& 1 1 1 \« 1 W 1 1 


AGATTTTCAA 


CCTTTCTGAG 


CAGTrrrrcT 


6600 


1V,L 1 V_ 1 


n I 1 Vjv. luu x v_ 


AAGGGACAAT 


CGATGAAAAT 


GACGAACACA 


GTCGCTACCA 


6660 


r\ i i uunnnuji 




TGTGACACCG 


TTAAAGAGTT 


CATAAGCGTA 


TTTGATGGCA 


6720 


111 VrLAwilUn 




ACGGCCGATA 


CCGTTAAAAA 


TAAAGGAAAC 


TTCATTCCAT 


6780 


i v- V i i SjVj l nu 




AGTATCCGCT 


TTCGAAGCCT 


GTAAAACTGC 


ATCGTGCAGG 


6840 


ft ATTTTTTAA 


CTGGAAGTGT 

\a> 4 VWrtflW 4 XV * 


CATGAGC3TCT 


CCTTTCTAAT 


ACTCAATAAA 


AATCAAAGAG 


6900 


f* AAACT AG AA 


AGCTAGCCGC 

/WJW X /l\7^\r K7^B> 


AATCAGCTCA 


AAACACTGTT 


TTGAGGTTGT 


AGATAGAACT 


6960 




GCtCAAAACA 


CTGTTTTGAG 

X X X X X **^"" r V^ 


GTTGTGGATA 


GAACTGACGA 


AGTCAgTAAC 


7020 


CATATATACA GCAAGGCGAA GCTGACGTGG TTTGAAGAGA TTTTCAAAGA GTATAAGTTA 7080 

TACTTTTACA ACTTGAACCT CGl'CTTTACC GAGTAAAATC AAGTATTTTT CAATATTTTC 7140 

AATCGAATAG GCTCGTGATA AAGCCTCTTC GTATAGAGCT AACTGACCAC GATAGCGGTC 7200 

TACGAGTTGA CTTGGTTCAT CATAGCGGTC TGTCTTGTAG TCGAACAGAA CAATTTTGTT 7260 

TTCGTAAAGC AGATAGCCAT CAAGGATACC ACGGACAACA AAGTCTTCCT GACTCTTTTG 7320 

GTCTCGTTTG AGCATGGAGA AAGGTTGCTC GCGATAAAGA TGGTCGGTAT TAGCAAGAAT 7380 

TTCCTGACCG AGTACTGTGT CAAAGAAAGC AAGAATTTTA TCAAGATTGA TCTTGTCTCT 7440 

GACAGCTTGG CTAGTTTGAA CTTGTTTGAG TGTTTCTGTT AGGCTAGCAA GGGTTAGTTG 7500 

CTGGCTGAGG TCAATTCTCT CCATGAGTTC GTGAGTAGCA CTACCAATCT CAGCTCCAGT 7560 

TACCTTTTCT TTGGTTGAAA AATCTGGCAA ATCGAAGCTG ATTTTCTTGC CTACTGACTG 7620 

ACCTTGACCA GCAATCTCGA CACCTTCCAT ATCCATAACT GGTTCGTAGA ATTTCTTGAT 7680 

TTGACTTGGG GTTTGAACAC TAGGAAGTTC AATAGCTGCG CGGTGAAGAG TATTATAAAC 7740 

TTCCACCTCC TTCAGCATTT CCAGAGCTTC TTTGATGGTA TCTGACTGAC CATTGTCTGC 7800 

TTGGCAGCTA TCTTGGAGAG GACTCTTGGT TTCCAACTCT CCGATAGCTT CTCTGGTCAA 7860 

CTGATCTTCG CCAATAAAAC GATAACTAAA GTTGAGCTTG TCCTTAGTAA ACACTTTACT 7920 

GATAGCCCAA AGCCAATCTT GGAAATTCCG TGCTTGCAGT CTAGTATTGC TATTTAGTTT 7980 

CCCATTTTTG GCTGCTGGGT ATTCCTTGGA TTCCAGCTTT TCACGAGAAC CCTTGCCGAC 8040 

AAGATAGAGC TTTTTCTCAG CCCGCGTCAT AGCAACATAC AGCAAACGCA TCTGCTCAGA 8100 

ATAGCTTGCT AGCTGTAATT CCTCTTCGTT CTGCCTATAG GTCAGACTAG GAATGGAGAC 8160 

TTTGATCCTT TTAGGATAGT GGTCTTCTAC TGCCCCTGTC TCCATCTTCG CAATATATTT 8220 

GACACCAAGA CCATTCTGAC GACTGAGAAT GACTTCTGAC ATAGAGTCTT GCTTGTTGAA 8280 

ATCTTGATCC ATATTGAGGA TAAAGACGTA AGGAAACTCC AGCCCTTTAC TCTTGTGGAT 8340 

GGTCATGAGC TCTACTGCAT CTTTTGGCGG TGCCACCGCC ACGCTTCCCA AATCGTGCTG 8400 

GGCTTCTAAG ACTTGGTCAA TCATACGAAT AAAACGCGAC AAACCTTTGA AATTGCTCTT 8460 

TTCAAATTGA TCAGCACGCA GTGCTAGGGC ATAGAGATTG GCCTCCCTAG CAGGACCATT 8520 

CCGCAAAGCC CCAACATAGT CATAATAAAA ACGGTCCTTC TAAATCTTCC AAATCAACTC 8580 

ATAGAGAGAG TGGGTTTTCG CATACAAGCG CCAAGAAGCT ACCATATCCA TGAATTGCTT 8640 


TAGTTTTTCA 


GCTAGAGCTG 


TGTGAATCAA 


GCCTTTTTGA 


CTACTTGCCA 


TTTTTTGTGC 


8700 


ATTGACCAGT 


TTCTCATAGA 


GATTTTCGTG 




TCTCCTTTCT 


GAAGGGACAA 


87 60 


ACGTGCTAGC 


TCATCCTCAT 


CAAAACCAAA 


CATTGGAGAC 


TTCATAAGGG 


CAACCAAGGC 


8820 


CTAGTCTTGC 


AGGGGATTGT 


GAATGACACG 


AAGAGTGTCT 


AGCATGACTT 


GCACTTCTAC 


8880 


GGATTGGAGA 


TAATTGTTTT 


GCTCTCCGTC 


AGTTTTGACA 


GGAATTCCGT 


ACTCAGACAG 


8940 


GGCGAGGAGA 


ATCTGGTCAT 


TACGACTGCC 


GCTGGAGGTC 


AGAAGGGCAA 


TTTCCTTAAA 


9000 


GGCAACACCT 


TTTTCTTGAT 


GAAGTTTCAG 


AATCTCCTTG 


ATAACTAAGC 


GCATTTCGCC 


9060 


TGTTAGTTTC 


GTTTCTGTTT 


GACTCTCTTC 


TTCCTCACCT 


GTATCGTCCT 


TGTCGTAGAG 


9120 


CAGAAATGCT 


CCCTTGTTGT 


CTGGATTGGG 


ACTCAGTTTG 


GTATTGGCAA 


AAACAAGCTG 


9180 


GTGCTTGTTA 


TCATAGTTGA 


TTTCGCCGAC 


CTCTTGGTCC 


ATGAGACGTT 


CAAAGACATC 


9240 
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ATTCGTTGCT GACAGCACTT CTGAACTACT ACGGAAATTT TCCTTGAGGA TAATGAGCCT 9300 

GCCTTCTTGG GGATTTTGCG CATAGCGTTG GAATTTCTCA TTGAAAATCT GCGGGTCTGC 9360 

CTGACGGAAA CGATAGATGG ATTGCTTGAT ATCTCCCACC ATAAAGCGAT TGTGGCCATT 9420 

AGACAACAAT TCCAGCATCC GTTCTTGAAT ATGGTTGGTA TCCTGATACT CATCGACCAT 9480 

GACTTCATGG AAGCGCTCCT GATAAGACTC ACGAACTTGT GGGAAATTCT CTAAAATCTC 9540

AATGGTGTAA TGGCTGATAT CAGCGAATTC GAAGCCATTT TCCTGTCGTT TTCTCTGACG 9600 

ATAAGCCTCT ACAAAATCGC TCATGAAAGA TTGGAAGGTT TTAGCTACTT TCCAAGTGTC 9660 

TCCATGATAA CGTTCTTGAT AGTCGAGAAT CGCTATCTGG TCTGATAATT GTCCTAGTTT 9720 

AGCAAACTGG GTCTTTCTCT CTTCGTTGTA GGCATCAGCC AGGGGCTTCA AATCAGCCTA 9780 

CGGCTGGCAT TAGTCAGAGC TCGACCGTTT TTCTCCTTAG AGATGGCGAC AACACGCGCA 9840 

AGCACTGCCT GATAAGCCTG ACTATCGGAC TCCTGATTTA GGGAGCCAAT TTCATCCAGA 9900 

ATTAACTGAA CATTTTCTAA ATAGGCAGCC TTTGCAAACT CCTTGGCATC GTTATCCAGA 9960 

TGGTAACGGA AAAAGCTTTC CAAATCCCAA ACGGCTTGTT TGATTTGCTC GGTCAGTTTT 10020 

TCTTTTTCAC TGGTAAAATC AGCTTTCTCA AATCCTTTGA GGAAAGATTC ACTCAGCCAC 10080 

TTTTGAGGAT TACTGGTGGA TTGGAGGAAG TCATAGATTT TATAGACCTG CTGGCGCAGA 10140 

CCCCGTTCGT CCTTGCCACG CCCAGCAAAG TTTTTCAGCA AATGACTAAA GGTCTCTTTC 10200 

TGTTTACCTT GGTAATGCGC TTCAAAGACC TCATGAAAGA CTTCGTTTTC GAGAATAAGT 10260 

TGCTCGCTTT GGTTTTGTAA AATACGGAAA TTAGCTGCAA TATCAAGCAG ATAACCATCT 10320 

TTGCCAAGGA ATTTTTGTGT GAAAGAATCC ATGGTTCCAA TGGCAGCGTT GCCTAGGTCT 10380 

GCCAACTGGC GACCCAAGTG TTGTTTGAGG TCGACATCAT CTGTTTCTTG GATTTTCTTG 10440 

CTGATTTTTT TCTCTAAACG TTCTTTAAGT TCAGTTGCAG CCTTGACGGT AAAGGTTGAG 10500 

ATAAAGAGTT GAGAAATTTC GACACCACGC GCCAATTGGT CCAGAATGCG CTCTGCCATG 10560 

ACAAAGGTCT TTCCAGAACC AGCCGATGCT GAGACCAGGA TATTCTGGGC AGAAGTGTAG 10620 

ATAGCTTCGA TTTGCTCGGC AGTTTTCTTC TGTTCCTTGC TCGAATTTGC TTCTGCTTCT 10680 

TGCAGTTTTT GAATCTCCTC CTCACTTAAA AAGGGAATAA CCTTCATCCA TTCAACTCCT 10740 

CTCTTATTTT TTCAAGCCAA GCTTGCTTGA GTTTTTCTCC GACCAGACGC TTGCCATCAG 10800 

CTAGGTCCAA CTTTTCTAGG AAACGGGCTT GGCCCAGATG GTAATTGGCT TCAAAGCCTG 10860 

TAATAGCCTG ATGTTGCTGG ACGTATGGGG CAATGCTTCT GCCATTTTCA GTATAAGGAT 10920 

TGATGGCGAA CCGGCCTGCT AAAATCTTCT CAGCAGCTTT CTTGTAAAGA TAGGCATTGT 10980 

AGTCCAGTAG GAGCTGAAAT TCCTCATCTG TCAGTTGATT AGCCTTGTTT TTGTTATAAA 11040

ATTCGCCTAA ATAACTGCTT TCTTTTTCCA AGAAGAGCCC TTGGTATTTC ATAGATTTGC 11100

TGGCTTCTAC CACTGCTCCT GCCAGACTTT TTACCGCCAT CAGAGATTGG ACAGGTTCAG 11160 

CCATTTCCAA GTACATGGCG CCGAAAAAGT TCTGCTCCCC TTCTCTTTTT AGGGCAGCAA 11220 

GATAGGTTGG TAACTGAGAA TTGAGCCCAT TAAAGAAATG AGGAAACTGG AACTGAGTCA 11280 

GACTCGATTT GTAGTCTACT ACTCCTATCG CTCCATTAGC TTTCAAACGG TCAATCCGGT 11340 

CCACCTTGCC TCGTACAAAG ACACTGCGTC CATTGTCTAA TTGAATAAAG GCTTGGTCTT 11400 

TTCCACCAAA ATTTGCTTCT TCTTTGATGG TTTCGATGGC TGGATTCTCT CGGAGAATAT 11460 

GTCCAGTTGT CCGTGCAACA TCAAGCAAAA CTTCCTTGGT AAACTGGGCT TCCAAACTTT 11520 

CTTGATAAAT AGCTTCAAAT TCGCGTTCTT GACTGGTTTC TTGAATAGCT TGTTCTAGAC 11580 

GTTGGTCAAA GGAATCTTCA TTAGGCAACT GTAAGGCGCC TTCAAAGATA CGATGCAAGA 11640 

AATTCCCGTG ACTACGGGCA TCAGGATGCA AACGTAATTC CTCCTGCAAG CCTAAAACGT 11700 

AGCGTAGGAA ATAACTGTAT TCATTGCGAT AAAACTCTGT CAAACCCGAC GTAGACAGGT 11760

AAAACTCCTG TTTGGCAGGA TAGAGAGCTT GCAAGGTGTC CTTGGCTAAG GTCTTGCTGC 11820 

TTGGACTGGT TGGGATAGCT GGATTTTCCA GACCTTCCTG ATCTAGTTTT TTACCTATGA 11880

CACGCGACAG AACCTTGACA AAAGTCAAAT CTTGCTCAGT ATCGCTCATC TCACCCTGCT 11940 

GGTGATAGGC AACCAGACTA GACAAAAGAC TGTGATAGGA CCCCATATCC TCCTTAGACA 12000 

GTCCTTTGTG ATTCATCCTC TTCTCTCTCC GCCTAAATCC AAAATGGATC AACTCTTGAA 12060 

GATAGGCAGA TTCCTTACTT TCACTTTCGT TAAAAAGGCT TGGAGCCGAC AAGAACAACT 12120 

GCTTACGAGC AGAATTGACC AAGGAAAGCA TAGTGTAGCG ATTTTTCTTG ACATTTTCAC 12180 

TOTGCCAAT CAGTAATTGA ACGCCTTCTT CGGTCGCTTC CrrTr.v:GTT7 TGCCTTTCTT 12240

CATCTGTCAG AAGACTGGTG TTTTGAGAAA TTTTTGGTAA ATTGTCCTGA GTTAGTCCAA 12300 

TAGCATAGAC AAAGTCAGCA GTCAATGGTG CAATCAAATC GTAACTCTGC ACCAGAACAG 12360

TGTCCACTGT TGCTGGAATG GTACGGTATT GGGACAAACT CATTCCAGAA TGGAGCAAGG 12420 

CTAGGAAGTC TTCCAGACTA ACCTGTGAAC CAGCAAAAAC AGTCGCAAAT TGTTCTAAAA 12480 

CATGGCAGAA AGCCTTCCAA ACTTCGGCTT GTCTTTCCTG TTCTACAGCT TCCAAAGTGG 12540 

TTGTCAAATC TTGTAACTGC TTGGTCACAG CTCCTTCTTT TAGAAAGACA CTCCATTTTT 12600 

GTAGGAGTTT TTCAGCCTTT TGTTTTCGGC TGGCAAAGAG GGTTTCAAGA GGTGCTAAAA 12660 

TTCTCAGGCG GAGGACATTC AAACGCTCAA GATTAAATTT TCCATGGTGG GATTTGGTGA 12720 

AGGTTTGCTG AAAGGCTGGC AAGCCATTGA TACCAAGATA GCGGATATAT TGCTCAAAAG 12780 




CATCAATATC AGACTGACTG AGGTCAGTAT ACAAATCAGT TCTAAGAAGA TTAATCAAAT 12840

CCTCCTGACG AAAACGGTAA CGTTTTAAAG CTAAAATAGA CTCGACAAAC TGAGTCAAGG 12900 

GATGATGAGC CATGGCTTCG CTTCTACCAA GATAAAAAGG AATCTGATAC TGGTCAAAAA 12960 

TGGTTTTGAG AGATAACTGG TAAGAAGCTA CATCCCCCAA GAGAATACGA AAATGCTTGT 13020 

AGCTCAGGTC TGAGTTCTCA TGTAATTTCT GACGAATACT ACGGGCTACT AGCTCCAACT 13080 

CCTCCTTTTG CGTCAAACAA GACCAGATTT GTAAATTTTC ACGGTCTTTC TCATCGACAT 13140 

CCAAAGCGAG TTCTGAAAAG TCATAAGAAG ACTCCAACAA ACGAGAGGCC TTGTCAAAAC 13200

TATCCATCTT CTCATGAGTT TGAGAACAGT CCTGAGCAGG CGTTTGGTAT TTAGAAGCCA 13260

GATGATGGAG AAATTTTACG CTGGCTTGGT AGAGATTGCC CTCGCTAAAA GGACTGGTAT 13320 

AGGCTTTCTT ACTAGCATAA GCCCCGATAA CAATCTCAAC ACCTTTGCCG TGAAGTAAGT 13380 

CCACAACCCG CTCTTCCTCA GCAGAAAAAC GAGTAAAGCC GTCAATGACC AAGGCGATTT 13440 

GATTAAAATC ACTACTTACC TTGTCATTCT CAATAGCCTC AATCAAATGG GACAACTGAC 13500 

TTTCCTGGGC TAACTGACCT TGATTAAGAT AGGCTGTTAC TTTCTCAAAA ATCAAGAGTA 13560 

AATCCGCCCT CTTATCCTCA TCTGTTAAAT TCTCCAAGTC CAAAAAACTC ATCTGAGATT 13620 

TGGTCATCTC ATGGTAAAGC TCAATTAACT GCTGGATCAA TTGAGGATCC TGCTTAATAG 13680 

CGCCATAAAC ACGCAAGTCC TTGGGATCGA GTTCGGCAAG GCATTTCTAA AAGGCCAACC 13740 

CAAGACCGAT ATCATCAAGA GTAGTTTTAG CTGGTAAATC ATTCAAGACC AGATAGCGAG 13800 

CCATTTGAGC AAAGCGCGTG ACGGTAATCG AAAAAGAAGC CTGCTGGGAC AAGTATTCCA 13860

GCACGGCGCG TTCCTTTTCA AAAGAAAGAG AGTTGGGGGC AATGTAGAAG ACCCGCTTGC 13920 

CAGCTGCAAC TAGCTCTTCT GCCTCTCTTG TTAGAATTTC TGTCAAAGAA GTCCGAATAT 13980 

CAGTATAAAG TAATTTCATC TCAGCCTCGT TGGAATTTTT CATCACCCTA TATTATACCA 14040 

TGATTAGCCT CGTAAATCTG TTAAAATATT TAGGCCATCC TTTCTTTTCT TCATCATCTG 14100 

CTAAATCTTA AATACTTAGC TTTACTTGTA TTAGATAGAA TAAGTCTGGC TACTGAAAAT 14160 

CACATAATAA AAAAGCCTCG GTAACAAGGC TTTGAGTTTT ATGATTGTTT CTTAGGTACG 14220 

GAATACACTT CAATGTGTTG TCCCAGTATC TTAATGTCGA CTGGTAGATT GTCTGATTTA 14280 

TCGCCATCAA CATCGGACTC TAATTCGATA TCAGAAGAAG TTTTAATATT ACGTGCCTTT 14340 

ATATATTCAA TATTCTTGAT AGAATGATTG AACTATAGTA AATTGAAACT ATAATAGTAC 14400 

ACCGTGGATG CTAAAATATT TCTAGAAATT AATTTGATTT CCCTAATCAA GCTATTCGTA 14460

TCTTATTTCA ATCTACTATA ATAAAATGAA CCAAAAATAG TACACAATGT GGTATAATCT 14520 

TCTTATGGCA TATTCAATAG ATTTTCGTAA AAAAGTTCTC TCTTATTGTG AGCGAACAGG 14580

TAGTATAACA GAAGCATCAC ACGTTTTCCA AATCTCACGT AATACCATTT ATGGCTGGTT 14640

AAAGCTAAAA GAGAAAACAG GAGAGCTAAA CCACCAAGTA AAAGGAACAA AACCAAGAAA 14700

AGTTGATAGA GATAGACTTA AAAACTATCT TACTGACAAT CCAGATGCTT ATTTGACTGA 14760

AATAGCTTCT GACTTTGGCT GTCATCCAAC TACCATCCAC TATGCGCTCA AAGCTATGGG 14820 

CTACACTCGA AAAAAAGAAC CACACCTACT ATGAACAAGA CCCAGAAAAA GTAGCCTTAT 14880 

TTCTTAAGAA TTTTAATAGT TTAAAGCACC TAGCACCTGT TTAGATTGAC GAAACAGGAT 14940 

TCGATACTTA TTTTTATCGA GAATATGGTC GCTCATTAAA AGGTCAGTTA ATAAGAGGCA 15000 

AAGTATCTGG AAGAAGATAT CAGAGGATTT CTTTGGTTGC ACGTCTAACA AATGGTGAAT 15060 

TAATCGCTCC AATGACTTAC GAAGAGACGA TGACGAGCGA CTTTTTTGAA GCTTGGTTTC 15120 

AGAAGTTTCT CTTACCAACA TTAACCACAC CATCGGTTAT TATAGTAAAA TGAAATAAGA 15180 

ATAGGGGGGG GGGGGGAGGG GGGGGGAGGG AGA 15213 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6004 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTATTACCTG AAACATTAAA TTTAATTGGA CATCCCGTTA TCAATTTTAT AATATCATCA 60 

AGATTTTTAT TATCTGATTC AGGAATTTTA TCTGATATAA CAACACCATT TTCAAGATAG 120 

TTCATTAAAT TATTTGATTC ACTAACATTA GTGTTTTGAT CTCCATCAAC CCAAAAATAA 180 

TGGTTATCGG AATCTAAATA CGATGAGTTT AAAATATTAT TACAAATTAT TTGATTTGCT 240 

CCACCAGGAA TATATCTCAC TACTAAATTC TGTTTAAGAT TCTCACTACC TGAATGAGTG 300 

ATAACAAACT CTAGAATATA TTTAGCTAGT CTATCTTCAA CATAAATCAT CTTCCTAGAA 360

TGATACACAT CACCTAATTC AAAAAATGCA TCCTGATAAT CAATATTTTC AATAACATCT 420

ACCTTTTCTC CGTTTTTCAC TAAAAGTTTC ACGGCTTCTC TAGGAAAATC TTTTATAAGT 480

TGTGTAGAAT GTGTAGTGAT AATAATTTGA TGTTTTTTAT TTAAACACTC TTGAAGTAAA 540

AACTCTTTAA ATTTATAGAT TGCACTCGGA TGAAGTGAGA TTTCAGGTTC ATCTATTAAT 600 

ATTAATGAAT TTGATTGCGC ATTTACTATA TCATTTACTA ACAAAATAAT TCTAGCCTCA 660 

CCTGTTCCTG CAAAAGCCTC GGAATATTCT TTTCCAGATT TTTTCATCCA AATAGTTTTG 720

GAAGCTTTTA TATCATCACC TTTTGAATAC AACTTATGTG TTAAAATTTG AATGTCTGTA 780

TAAGATTCAT CCATTATTTC ACTAATAATT TCACAAACTT TATCATCAAC TTTAACATTA 840

TCTATAACCA TTTCCTTTTT ATAACGCGTA TAGCTACTTG TATTATTCTT TAAAATATCA 900 

GCAACTGGCT TAGATCGTAA TCTTATAAAA TCTTGTTTAC TACGTTGAGT AGAAATTTTT 960 

TTAAAATTAT AGTGATAGAA AAATAAATCA AAAGCAGAAA CATATTCTTT ACAATCACAA 1020 

AAGACAACAT TTTTTTCAAT GCCATCCCAT CTGTCTGTCG AAGAACTTCC AATATATTTA 1080 

TTTTTGGGTA ATCTTTCCAT CTCATATTGT TTTTGAGGAG CATATGGTTC CCAATAATCT 1140 

AATCCTTTTT TTGTTCCAGA ACGGCCTTTA AGAACTTCTA CATTTCTAGA AGCTTTAATG 1200 

TTATAATATG AATAGATTAA ACATTGTTTC CCATCCACTT CATCTATTTG ATCAACATTT 1260 

GTACTAAACC AATATTCAGA CACACTTTTA TTGGCTGGAG AACCATATAA AGCTTGTAAA 1320 

ATTGAAGTTT TATTTACTCC ATATCTATTA CAGACACCTC AGGATTATTT AACTTATAAG 1380 

TTTTAACAGC TACGGAATCA ATTTCAACAG CAACTTGAAC ATCTATGCCT GATTTTTTAA 1440 

GGCCACTTGT AGTGCCACCT GCACCGTTAA ATAAATCAAT AGCAACAATT TTCCCCATAG 1500 

TATTCTCCTA AAGTTTCTCC TTTTTATTAT AACATTATCA AATGTAAAAC CCAACCCGAT 1560 

AGGGTTAGGT TTTTAACATC ATTTCACCAA CTTCTTCATC TCATCAATAC GTGCGACGGT 1620 

CGCGTCATAT TTAGCTTGGT ACTCAGCTTG TTTGTCGCAT TCTTTTTGGA CGACTTCTGG 1680 

TTTGGCGTTG GCTACGAAGC GTTCGTTAGA GAGTTTCTTA CCAACCATGT CCAGTTCTTT 1740 

TTGCCATTTA GCAAGTTCCT TGTCGAGACG GGCCAGTTCT TCTTCAACAT TGAGGAGATC 1800 

GGCCACTGGC AGGTACATTT CTGCTCCTGT CATGACACTT GACATAGCCA GTTCAGGTGC 1860

AGGGATGGTT GATGCGATTT CCAAGTGTTC TGGATTTGTA AAGCGTTTGA TATAGTTGAC 1920 

ATTGCTGTTA AAGAAGGCTT CCAAGTCGCT ATCGCTTGTC TTAACAAGGA TGGTGATAGG 1980 

CTTGCTTGGT GCTACATTTA CTTCCGCACG CGCATTCCGA ACAGCACGAA TCAAGTCTTT 2040

GAGACTTTCC ACACCAGTGT GAGCCGCAAG GTCTTCAAAG GCTAGATTAA CAGTTGGGTA 2100 

TCCAGCTGTC ACGATAGAAC CTTCTGAGAT TTGTCCAAAG ATTTCCTCTG TCACGAATGG 2160 

CATCATTGGG TGAACGAGAC GAAGGATCTT CTCCAGCGTA TAGAGGAGAA CAGATCGACT 2220 

AATGACCTTA TCGTCTTCAT TGTCGCTGTA TAGAACTTCC TTGGTCAACT CAACATACCA 2280 

GTTGGCAAAT TCTTCCCAGA TGAAGTTGTA AAGGATATGA CCAGCCACAC CAAACTCGAA 2340

CTTATCAAAG TTTTCAGTAA CTTTTGCAAT GGTTTCGTTG AGATTGTGGA GAATCCAGCG 2400 

GTCCGTCACA TTACCAGCCT CACCTGTTGC AACTTTTGTG ACATTGTCAT GCGCCACATC 2460 

CAGCGTCAAA CCTTCATTGT TCATGAGGAT ATAGCGAGAA ATGTTCCAAA TTTTGTTAAT 2520

AAAGTTCCAT GAAGCATCCA TTTTCTCGTA AGAGAAACGA ACGTCTTGAC CTGGTGCGGA 2580

ACCGTTTGAA AGGAACCAAC GAAGGGCATC AGCACCGTAT TTCTCGATGA CATCCATTGG 2640 

GTCAATCCCG TTACCGAGAG ATTTAGACAT CTTGCGTCCT TGCTCGTCAC GGATGAGACC 2700 

GTGGATAAGC ACGTTTTGGA ATGGCTGACG ACCAGTAAAT TCCAAGGACT GGAAGATCAT 2760 

ACGAGACACC CAGAAGAAGA TGATGTCGTA ACCTGTTACC AAGGTTGAAG TTGGGAAATA 2820 

ACGTTTAAAG TCTTCTGAGT CGACTTCAGG CCAGCCCATG GTTGAAAATG GCCAGAGGGC 2880 

AGAACTGAAC CAAGTATCCA AGACGTCTTC GTCCTGAGTC CATCCGTCAC CTTCTGGAGC 2940 

TTCTTCGCCG ACATACATTT CACCATCAGC ATTGTACCAG GCAGGGATTT GGTGACCCCA 3000

CCAAAGCTCA CGAGAGATAA CCCAGTCGTG GACATTTTCC ATCCATTGAA CGAAGGTATC 3060 

GTTGAAACGA GGTGGGTAGA ATTCGACCTT GTCCTCTGTG TCTTGGTTAG CAATGGCGTT 3120 

CTTAGCCAAT TGGTCCATCT TGACGAACCA TTGAGTAGAC AAGCGTGGCT CAACTACGAC 3180 

ACCTGTACGT TCTGAGTGAC CAACACTGTG GACACGTTTT TCGATTTTGA CAAGGGCACC 3240 

GATTTCTTCC AACTTAGCAA CGACTGCCTT ACGAGCTTCA AAACGATCCA TGCCTGAAAA 3300 

TTCAAAGGCA AGCTCATTCA TAGTTCCGTC GTCGTTCATG ACGTTGACTT GTGGCAAGTT 3360 

ATGACGTTGG CCAACCAAGA AGTCATTTGG ATCGTGGGCA GGTGTGATTT TCACGACACC 3420 

AGTACCAAGC TCAGGATCTG CGTGCTCATC TCCAACGATT GGGATGAGTT TATTAGCGAT 3480 

TGGAAGGATG ACGTTTTTAC CAATCAAGTC CTTGTAGCCC GGGTCTTCTG GATTAACCGC 3540 

AACCGCAACG TCCCCAAACA TACTCTCAGG ACGAGTTGTA GCAACTTCAA GGGCGCGTGA 3600

ACCATCTTCC AGCATGTAAT TCATGTGGTA GAAGGCACCT TCTACATCCT TGTGAATCAC 3660 

CTCAATATCA GAAAGGGCTG TCCCACCTGC TGCGTCCCAC TTG.VT^ATAA ACTCACTACC 3720

ATAGATCCAG CCTTTCTTGT AAAGGTTCAC AAAGACCTTA CGAACAGCTT TTGACAAACC 3780 

TTCATCAAGA CTGAAACGCT CACGAGAATA GTCTACAGAA AGCCCCATCT TGCCCCATTG 3840 

TTCCTTGATG GTAGTGGCAT ATTCGTCTTT CCATTCCCAG ACCTTCGTCA AGAAACACTC 3900 

ACCACCTAGG TCATAACGCG TAATACCCTC ACCACGTAAG CGCTCCTCAA CCTTAGCCTG 3960 

AGTCGCAATA CCAGCGTGGT CCATACCTGG AAGCCAAAGG GTATCAAAGC CTTGCATGCG 4020 

TTTTTGACGG ATGATGATAT CCTGCAAAGT CGTATCCCAA GCGTGACCAA GGTGAAGTTT 4080 

CCCAGTTACG TTTGGTGGTG GAATCACGAT TGAATAAGGC TTAGCCTTTT GATCGCCTGA 4140 

AGGCTTGAAA ACATCCGCAT CAAGCCATTT TTGGTAACGA CCAGCCTCAA CCTCGGCTGG 4200 

ATTGTATTTA GGTGAAAGTT CTTTAGACAT GTGTGTGTCC TTTCTCTATT TTGTTTATTT 4260

TATTTTGAAT TTGCTTAGCA GCTTCTTCTG CAGACAAATT CGTATTATTT ATTTTAAAGT 4320

AGTGGTGCAA CTCATTCCGT TGATGTTGGG AATTTAATTG AAGTGTTTCA CCGGTCTCTA 4380

AAATTTCTCT TTCAGATACC TCAATATGTC GTTTTAAGGG TTTGTGCTTT AATCGATTCT 4440 

CCCTTCGATT TCGACGTATG CACTCTTCAA GACTTGTTTC CAATTCAACA AACAGAATCT 4500 

CTTGATGAAA GTTATCCAAT AAATCCTGAA TTTGCTTTAA ATACATCAGC TGGTACTGAT 4560 

TTGAAAAATC AATTACGTCT GTTAAAATTA CTGATCGCTG ATTTCTTGCA CTTGCTCCAA 4620 

GGAAAGAAAA GGTAATTCCA CGAACAAATT CCCACATCTC CTCGGTATAA TCCTGATAGA 4680 

TCTCTAGTGC AAAATCAATG GCTTGATGGT TATAAAATAG GGTAGCATCC GTCAGTCGAG 4740

ATAATTCTTG ACCAATGGTC ATTTTTCCTG ATGCTGGAGC ACCAATGATG AAAAGATGCA 4800 

TCAAATCACC TCCCACTCAC TCCTCAGCAA GCCATATCTC AAATCATCAC AGCAGTTGCC 4860 

TTGAGCATCT TTGCGGTCTC TTATCCGAGC TTCGAGGGTA AAGCCAAGCT TTTCCGAGAC 4920 

TCGTTGACTT TGAAGGTTAT ATCCAAAGCA AGTTAGTTCA ATCTTGTGAA GACCAAGTTC 4980 

TTTAAAAGCT AGATCAATCA AGGAACACGC TGCTTCTGGA ACATAACCTC GACCCCAATA 5040 

GTCTGGGTGC AAGGTATAGC CAAGCTCTAG CACATCATCC GCATGAAGAT GGTTGAAGTC 5100 

AACAGAACCA ATGACTTTAT CGGTTCCTTT GACGACAATC CCATAGCCAG CTGGGAGATT 5160 

TTCCTTTTGA GTACGCTCCG GAAGAATGTG CTCCAGATAA TAAATCTCAT CTTCCAAGAT 5220 

CTTGACTGGA GGAAAACCTG CTGGATAGGC GACCTCTGGC AAACTAGCGT AGGTATGGAT 5280 

ATCCTCAGCA TCCACCACTG TGCGGACTCG TAAAACGAGA CCTTCTGTTT CGATTTTATC 5340 

TGGCAGCTCA GTTCTTGCCA TCCTTCTTCC TCGCTTTTTT GATGAAACTG CCCTTCATAT 5400

CTACACGCTT GTCCAGATAG CGATAAACGC GCTGATATCC ATCTCCCATG AAATAGGTTG 5460 

GGGCAAACAG TTGATTTTTA AAATGTCCCT TTTCATCCAG GAGTTCTGGG GCAACAAGTC 5520 

GCTCAAGAAT CTTGGCAAAG ATGTGGCAAA TACCGTCTTC CTCAACAATC CTATCTACCC 5580 

GACAATCTAA AACAAGTGGA CAGGCGTCTA AAATAGGAGT CTGAGTTCGT TCAGAAAT IT 5640 

CATAATGCAC TCCCAAACGT TCCAATTTCT CCTGATGACT GATAAAACCA GCCTGCTCCA 5700 

TCGCAAGCAT AGAAGTTTCA TCAGAAATAT TCACAGTAAA TTTTTGATAC TGTTTGATCT 5760

GCTCTGCGGC ATTCTCTCTC GCAACGACTC CAATCACAAC CCAATCTCCT AGACTATAAG 5820 

AGGAACTACA GGTCGTGATG TTATAGCCAA AATTCTAATC TTGATATCCT AAAATAAAAA 5880 

CAGGAAAACC ATAATATAGT TTACTTGTGT TAAAAGATTG CTTCATAACA ACCCCCTTTG 5940 

ACTAAGACGT AAAAGAAAAG CCCTGCCATC TACATGACAG GGACGAATGT GTTTATCCGC 6000 

GGGG 6004

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5857 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



TGTAGAATTC ACGACAATGC TTCGTTGATT TCTGGGTTGA TTTCGTCGCG TTCTGGCAAG 60

CGAGTCAATG AACCAAAAAT AGTACACAAT GTGGTATAAT CCTTTTATGG CATATTCAAT 120

AGATTTTCGT AAAAAAGTTC TCTCTTATTG TGAGCGAACA GGTAGTATAA CAGAAGCATC 180

ACACGTTTTC CAAATCTCAC GTAATACCAT TTATGGCTGG TTAAAGCTAA AAGAGAAAAC 240

AGGAGAGCTA AACCACCAAG TAAAAGGAAC AAAACCAAGA AAAGTTGATA GAGATAGACT 300

TAAAAACTAT CTTACTGACA ATCCAGATGC TTATTTGACT GAAATAGCTT CTGACTTTGG 360

CTGTCATCCA ACTACCATCC ACTATGCGCT CAAAGCTATG GGCTACACTC GAAAAAAGAA 420

CCACACCTAC TATGAACAAG ACCCAGAAAA AGTAGCCTTA TTTCTTAAGA ATTTTAATAG 480

Hi t\j\J\\J\~f\\~ PTAA.CACCTG TTTAGATTGA CGAAACAGGA TTCGATACTT ATTTTTATCG 540

AGAATATGGT CGCTCATTAA AAGGTCAGTT AATAAGAGGC AAAGTATCTG GAAGAAGATA 600

TCAGAGGATT TCTTTGGTTG CAGGTCTAAC AAATGGTGAG TTAATCGCTC CAATGACTTA 660

CGAAGAGACG ATGACGAGCG ACTTTTTTGA AGCTTGGTTT CAGAAGTTTC TCTTACCAAC 720

ATTAACCACA CCATCGGTTA TTATTATGGA TAATGCAAGA TTCCATAGAA TGGGGAAGCT 780

AGAACTCTTG TGTGAAGAGT TTGGGTATAA ACTTTTACCT CTTCCTCCCT ACTCACCTGA 840

GTACAATCCT ATTGAGAAAA CATGGGCTCA TATCAAAAAG CACCTCAAAA AGGTATTACC 900

AAGTTGCAAT ACCTTTTATG AGGCTTTTTT GTCTTGTTCT TGTTTCAATT GACTATATAA 960

ATTGTCTAAG CGAAACAACC GATAAGAATT GGCACAAAAG CGACCGTATT TTTGTTACCA 1020

ATACAGGAAA AACAGTTCAT AGTTCTATCT TGAGCAAGTC TCTCCAGCGA GCAAACGAAC 1080

GCCTTAAAAA ACCAATTCCC AAACATCTGT CCCCTCACAT CTTCAGACAC ACCACTATTA 1140

GCATCTTATC AGAAAATAAA ATTCCTTTAA AAACAATCAC GGACAGGGTT GGTCATCCCG 1200

ACTCTGAAGT CACTACTTCC ATCTACACCC ACGTCACAAA GAACATGAAA GATGAAGCAA 1260

TCAATGTACT GGATAAAGTT ATGAAAAAGA TTTTTTAAAA AGTTTTGTCC CTTTTTTGCC 1320

CTCTAAATAC AAAAATAGCC CTTCGGATAA AATCCGAGGG GCTAGAAACG TTGTTAAATC 1380

AACGGCCGAA CTTTTCAATT TCATGGTTCG GGATAAAATA GTTCACTGAA CTATTTTATT 1440

TTTTAAGGTT ATCATAATAT CAAATAGTTC AATTAAATAC GCTAAATTAC TAATATACTT 1500 

TTTACCTTTT TCATTCTAAA ATGTAAAGTA CAAACAATTA CAATATACTA GAGGGGGAGT 1560 

AAAAAAGGTA TTAAATCGAT GAGTTCAGCA GGCAAGAAAA TAGCACCTTT ACGGGTGCTA 1620 

TTTTTTAATT AACGCCACGT TAACTTTTGA TTGATGAATT TTATTGTTTG GCACTTCTTT 1680

CATTTCACGG TAAACATCGA TGAAATTCTT TCCAACATTA TTTTTGGAGT TAACTGCATT 1740 

TATTTTTGTA TTAATAACTT TTTTAGTATC GAAAGAATGG TTTAAGAAAT CCATAACTAA 1800 

CTCTCCTTTC TCATCCTGTA ATCAAGATTT TTATCAATGT CAAAATAGTA TTTTCTATCA 1860 

ATCCAAATTG GTCCTTCTCC TTTAGAAATA GCAAGTACAT CTACCGGACC TCCTACTGTT 1920 

TCAAGAGTGT TGACAATTTT TCTCTTAAAT GAAGTTAATT CAATAAATGT TTTAGCTGTA 1980 

CTCGCCATTT CATTAAGTGG TTGCATTCCA ATAAGGTCTA TTATAGGATT TATATAATAT 2040 

TTTTGCTGTA TAGATGATAT ATTTTCAAAT ATATTCTCAA TTTCATCACC CAATCCATTT 2100 

TTCTCCATAA CTGATGATAC TTGCTCTGCG ATATATACAT TTAAGTTAGG ATCTATACCA 2160 

TTCATAATCG TCTCAACCAT CTCTGACTGT GCAAAAGGGA TTATATGACA AGTTTTATGA 2220 

TGATTTATCA CACTTTCATT AATAACTTTC CAAATTAATC GTTTAGAAAA AATTCCATAT 2280 

AATTCAATTT GTCTTATAGA TGGAAATATC TCGTCTGTAC CATAACCTGC TATAACTAAT 2340 

CCAGTTATGT TTGTTGAGTC ATATCCAATG AAAATCGCTT TATATAAAGA TTTAGCAATA 2400 

ACTTCAACCT CATCATCAGT ATGAGGAAAG GATTTAAAAA CATCGTCTAC AATGCTTTTT 2460 

ATTAACTCTA ACTCAGCTTC AAAAAATTCA AAATTACTTT CAGCTTCTAC TTTTGAAATT 2520 

TCTAAACTAA AATTAGTTAT AGCATTTAAT AAAATTTTAT TAAAATCATC TAGAGTGATG 2580 

GTTTCACCAT TAGAAACTCT TAAATCAGCT GTTTCTTGCG CTTCATAGGC AATGCTGTCC 2640 

AAAATACTTC TTGTACTTCT GACAATATAA TTTCTTAATA AATCCTCAAC TTGTAGATGT 2700 

TTAAAGGAAA TTAAAAATTC TATTAGCTTT TCAACGTATT GGGCAGTATT ATCTAATAAA 2760 

TCTGTGCCAA TAGCCTGCTT AAACTCATTT AAAATTACCT CCCACGGAAT TTCCATAAAC 2820 

GAAGCGTTCC CATATATCAT GATCCCCACG GAATGTTCTT TTGATAAAGT GAATAATTTT 2880 

CGGGCGCTAT TAAAAACTTT TGAATTTTTC CCGTCTGATA AGGTTACAGC GCTATCAGAA 2940 

GCCAATACAA CACCATTTTT ATTTAATATT CCAATTTCTG CTGTCAAAAT ATCACCTAAA 3000 

CTTTCTAAAC CTGCTCATGC TCTAATGGTA CAACAGCTAA GGTCTTACCA AGACTTGCCA 3060 

ACACTTTTAA TACTGTATCA AGTTGTGGGC TTGTCTTTCC TGTTTCCATT CTAGCGATAA 3120 

CTGGCTGACT AACACCGCTC ATCTCCTCTA GTTTCTTCTG ACTAATACCC TTTTCATTTC 3180

TAGCCTCGAT AAGCTCACTC ATGATAGCCA CGCGCATATC ACTTTCCAAA ATTTCCTCTT 3240

TGCTGAATAA TTCAGCTCTT ACATCTTTCC AGTTACTACC AATAGCATTA TTTTTCATTG 3300

TCTAAACCTC TTTCTTTTAA ATCTGCAAGT TCACGTTTAG CTTGCTCAAT CTCTCTTTTG 3360

GGTGTTTTCT GTGTCCTTTT CATAAAATGA TGCAGTAAAA CAAAACTACC ATCCATCCAA 3420

GCAACAAATA AAATTCTATC TCTAAGTGGT CTCAGCTCCC AAATTTCAGC ATCTAAATGC 3480

TTAATATATG GTTCGCCTGC GCGTGTTCCA TGTTGGCTTA ACAACTCAAT ATAATCATTA 3540

ATTTTATTAA GCTTAATTCT GCTATCTTTC CCTTTTTTAC TGGTAAGCTC TCGCATATAA 3600

TCAAAAACAG GCTCATTGCC GTTTTTATCC TTGTAAAAAT AGATATTATG CACTATTAAC 3660

ACCTCTTCCT AATAACAATT ATAACCTAAA AGTTATTGTT TCTAAATACT TTTAAGTTAT 3720

TAAAATAAAA AGCACCTAGT TTCCTAGATG CTACCACAAT GACACGGATT CGCACCGTGG 3780

CTACCTCTAT CAAGGTGTAC TCCTTCTATA CTATCCCTTG TGCTTTAGAA TATTATACCA 3840

CACAATCAAC TAGATACCTA CCATCTCATG ATATACCCCC ATTTTGGGCA AGGGTACAAC 3900

GCTAAAATAC AAATCAGAAT AGATATTAAA CCACTTATTT AACTTATCAT AAGCTGGTGA 3960

TTGACTGATA AATAATATCC GCTGACAAGC TCCGATAACA TTCATGTGAT TGTACACATA 4020

AACCTCTTTT ACAGCCTCTA AAATGTCAGC CTCACTTGTT TGTACCCTAA TATCTGTTAT 4080

CTGCTTGATA GTTGCGTATT TTTGATAAGC TAGCATATCT TGATTTTTAG CAGCATCAAA 4140

CATTTTACGC TCAAGGACAC TATACTTAGG TTGTTCTTTA TCTCGCATGA AATACCACTT 4200

GAGCCATAAA ATCTTTTCTC GGTGTATTAC ACAAATACGC TCAATTTTCT TCTTTGTCAT 4260

TGCTACCTCC TAAATCATCA ATTTAACAAT TCTAACCACT CACTTTTAGA AATAGTTGCA 4320

TAGATCTTGT TCGATGTATG ATACAAAGGT TCTAAATCTT TTTCCACCCT AATATAGTTC 4380

ATCTTATCCT CATGAGTAGG AAAGTATAGT ATTTCCCTTT CATCCTCGTT TAGGATACGA 4440

TTGCACCAAT CATCAATAAT AACTGGCACT TCCCACTCAC GCCATTTTTT AAGGTTTTCT 4500

AAAAGTTCAT TATCACTAAA TAGCTCGCCA TCTATTTGGA AAAATTCCCC TAAGTCATTG 4560

TTTCCTTCAA CAATAATAAA CTCTGGCATA TTTCTATTAC TTAATAACTC CTTGAGTTCT 4620

TGTAACTCTT TGATTTCCTT TAGATACTTC CTCAATTTCC AACCTCAATT CTTCAATCTG 4680

CCTTACTACT CCAAAAATTT CATGGGTCTT ATAAGATTGT TCAAGTATAG CCTTTGCTGC 4740

TTGAGTTCTT ATAAACGGGT TGACCTTACT GTCCATCATA ATATCATTGA GTACAGAAAC 4800

AGCGTTAGAT GATGCTAAAT AAAGCATTTG AGTTGTTTTA TCCATCATCT CATCTTGCTT 4860

TATCCTCAAT GTCTTTTTAA CCGCTGCAAC TTTTAGATAC TTATGACCTG TTGCGCGTGA 4920

TACCCCTGCT TTTTGACATG CTTTGTCTAT CGTTGGCTCG GTAAGCATGG CATCTATGAA 4980

TTTAATTTGC TTGGACGTAA GGTTATCATT TTCATTTCCT GCCATCTATT ACCTCCTCAT 5040

TATCAAAATA AAGGGTTGCC CCTTTATTTC CCTATGCTAG ATAATTCTGC AATTCTGCAT 5100

CCATTGCCTC TGAATTGCCC TCAACAATCA TTTCATGCTG TACTAAATCA ATCTTATCTC 5160

CGTTAATAAG TAAACCACCG TGGAAATAAT CAATTTTTCT ATCAAGGAAA TGTACTAGCT 5220

TTTCAAGGCG TTGCTGTTGG CTGAATTGCT CCATGTCAAT TTCGATATAA GCAAGGGTAG 5280

TATCATTATC CATAATATCT TCTAATTTTC TAAGAGCTAG AGGTTTATTT TTATATTTTT 5340

CTAGGTATTC TCTCATTTCT GCCACTGTTA ATTTGATACT AGATAATAAA CTTAGTTCAG 5400

CTGCATCATC TGCTGTAATA GGCTCTTCTT TTGATTCATG GTTTGCTAGT TCAGCATTTT 5460

TCTCTTTTTC TAGTTGCTGA TACAATAGCT GAGCAGTATT TTGGGAATAG TTTTCGCCCT 5520

CTTTTTTATA TTTTAAAAGT TCTTGCTCTG CATACACTTT CCCGATAATC ACTTCCTTAT 5580

AAACTAATTG CCCATCTTGA GCTTTTAGCT TAATACTCCC ATGCTCTGGA ATTTCAATAT 5640

ACTTAATTAT ACCATTTTTT GAGTATAAAA CAAAGCCTTT CTCCATCATT TTTAATAATT 5700

TATCATCCTT GTTTTCAGTC ATGCTTTTCT CCTTTATTTC ATTTTATTAT AATCTGAATA 5760

CCCCTAGTCT ATTTATTTCA CTAGGTTTTT AGGGTTCGTA TGCTAAAATA CTACCCTTTT 5820

TGTGTACCTT ATGGCTGACT TTTCAAATTG GTTAGTT 5857

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10254 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

AAAATGATAG CAGGAGAGTT TTCCCGTCCA TCAGACCCAG AACTGAGAGC CTTAGCTCAG 60

GCTTCTCGCC AAAAACAGGC CGCCTTTAAC AAGGAAGAGA ACCCCTTGAA GGGAGCCGAA 120

ATCATCAAGA CTTGGTTTGC CTCAACCGGG AAAAATCTTT ACATCAACAC TCGCTTGATG 180

GTGGACTACG GTGTCAACAT CCATCTAGGG GAAAATTTTT ATTCTAATTG GAACTTGACC 240

ATGCTGGATA TCTGTCCCAT TCGTATCGGG GACAATGCTA TGATTGGTCC TAATTGTCAG 300

TTTTTGACAC CCCTCCATCC ACTAGATCCA CAGGAACGCA ATTCAGGTAT CGAGTACGGA 360

AAGCCTATCA CAATCGGAGA TAATTTCTGG ACTGGTGGTG GCGTCATTGT CCTTCCTGGA 420

GTGACACTGG GAAATAATGT CGTTGCAGGA GCAGGGGCAG TAATTACCAA ATCTTTTGGC 480

GACAACGTTG TCCTAGCTGG CAATCCTGCG CGCGTGATTA AGGAAATACC TGTTAAATAG 540

AAGTAAAAAG GAACAGCTGG GGTTGTTTCT TTTTTGTAGG TTTCATCATT TTTTACCCAG 600 

TTCACATTTA CCTACTCTAT CTCTTAGCAA GTCTGTTTCA TTAAGCAAGT TCAAAGCATC 660 

TCGTAAGTGG GATGTTTTTC TCCTCAGTTC ATCAGCTTCC TCCTTGACAC TCGGTCAGAT 720 

TTTGATACAA TAGTACAAAA TTAGAGGAGG CAGGCTATGA TTCAGAAACA TGCGATTCCT 780 

ATTTTAGAGT TTGATGACAA TCCTCAGGCG GTTATCATGC CCAATCACGA GGGGCTGGAC 840 

TTGCAGTTGC CAAAGAAGTG TGTTTATGCA TTTTTAGGTG AGGAGATTGA CCGCTATGCG 900 

AGGGAAGTAG GGGCGAACTG TGTTGGCGAA TTTGTTTCTG CCACCAAGAC CTATCCAGTT 960 

TATGTCGTGA ACTACAAGGA CGAGGAGGTC TGTCTGGCTC AGGCTCCTGT TGGCTCCGCT 1020 

CCAGCAGCCC AGTTTATGGA TTGGTTGATT GGCTATGGTG TGGAGCAGAT TATCTCTACT 1080 

GGGACCTGTG GTGTCCTAGC TGATATAGAG GAAAATGCCT TTCTAGTCCC TGTTCGCGCT 1140 

CTGCGAGATG AAGGAGCCAG TTACCACTAT GTGGCACCTT GTCGTTATAT GGAAATGCAG 1200 

CCAGAGGCTA TTGCTGCTAT TGAGGAAGTT TTGGAAGACA GAGGGATTCC TTATGAAGAA 1260 

GTCATGACCT GGACGACAGA CGGTTTTTAC CGAGAAACGG CTGAAAAGGT GGCTTATCGT 1320 

AAGGAAGAAG GCTGTGCTGT TGTGGAGATG GAGTGTTCTG CTCTTGCGGC AGTAGCTCAA 1380 

TTGCGTGGGG TTCTCTGGGG TGAATTGTTG TTCACAGCAG ATTCTCTAGC GGACTTGGAC 1440 

CAGTACGACA GTCGTGACTG GGGCTCGGAA GCTTTTAATA AGGCGCTAGA ACTGAGTTTA 1500 

GCAAGTGTTC ACCACCTTTA GTTGTACTGG CAAAGGATTT GTTTTATCAT AAAATGTCTA 1560 

GCTCATACTT TTCAAAAATA TGTTTAAACG AGGTCACCTT CCTCTTGTCC TAGGCATGTT 1620 

GAGGTTGGGA AAAATCTTTA AAATCAGAAA AACGTATCAT ATCAGGTGAT GAAAACTTTG 1680 

ACACTATGCG TTTTATGTCG ATAAGATTTA GACTGAGATG AAATGATACT CTTCGAAAAT 1740 

CTCTTCAAAC CAGGTCAGCT TCACCTTGCC GTAGGTATAT GTTACTGACT TCGTCAGTCT 1800 

TATCCGGCAA CCTCAAAACG GTGTTTTGAG CTCACTTCGT CAGTTCTATT TGCAACCTCA 1860 

AAACAGTGTT TTGAGCAACC TGTGACTAGC TTTCTAATCG ATGCCTTGGT TTTCATTGCC 1920 

TATAATCAAA AAGAGAAATT TTCTCCTGAA AAGCATATAG ACTAGCTGGC GTTAAAAGCT 1980 

CCTGTCTTGC TTTTTTGACC TATAGTCACA TCTATCAAGT ATTGTTCTTG CCTAAGCTAT 2040 

CAATAAAAAG GTGGCATTTT TTAGGCTTGG TGTTAGTAGA TTTTGCCTTA TCCTATCTAA 2100 

GTCATTTCGA ACTTTTTATG GTACAATGGA AACATGTTAT TCAAATTATC TAAGGAAAAA 2160 

ATAGAGCTAG GCTTATCTCG TTTATCGCCA GCCCGTCGTA TTTTTTTGAG TTTTGCCTTG 2220

GTCATTTTAC TAGGCTCTCT TCTTTTGAGC TTGCCCTTTG TCCAAGTTGA AAGCTCACGA 2280

GCGACTTATT TTGATCATCT TTTCACTGCT GTCTCTGCAG TCTGTGTGAC GGGTCTCTCA 2340 

ACCCTTCCAG TAGCTCACAC CTATAATATC TGGGGTCAAA TAATCTGTTT GCTCTTGATT 2400 

CAGATCGGTG GTCTAGGGCT CATGACCTTT ATTGGGGTTT TCTATATCCA GAGCAAGCAA 2460 

AAGCTTACTC TTCGTAGCCG TGCAACTATT CAGGATAGTT TTAGTTATGC AGAAACTCGA 2520 

TCTTTGAGAA AGTTTGTCTA TTCTATTTTT CTCACGACCT TTTTGGTTGA GAGCTTGGGA 2580 

GCTATTTTGC TTAGTTTTCG CCTTATTCCT CAACTTGGCT GGGGACGTGG TCTTTTTACT 2640 

TCCATTTTTC TAGCGATCTC AGCCTTCTGT AATGCCGGTT TTGATAATTT AGGGAGCACC 2700 

AGTTTATTTG CTTTTCAGAC CGATTTACTG CTCAATCTGG TGATTGCAGG CTTGATTATT 2760 

ACAGGCGGCC TTGGTTTTAT GGTCTGGTTT GATTTGGCTG GTCATGTAGG AAGAAAGAAA 2820 

AAAGGACGTC TGCACTTTCA TACGAAGCTT GTACTATTAT TGACTATAGG TTTGTTGTTA 2880

TTTGGAACAG CAACTACTCT CTTTCTTGAG TGGAACAATG CTGGAACGAT TGGCAATCTC 2940 

CCTGTTGCCG ATAAGGTTTT AGTTAGCTTT TTTCAAACAG TGACGATGCG AACAGCTGGC 3000 

TTTTCTACGA TAGATTATAC TCAGGCTCAT CCTGTGACTC TTTTGATTTA TATCTTACAG 3060 

ATGTTTCTAG GTGGGGCACC TGGAGGAACA GCTGGGGGAC TCAAGATTAC GACATTTTTT 3120 

GTCCTCTTGG TCTTTGCACG AAGTGAGCTT CTAGGCTTGC CTCATGCCAA TGTTGCGAGA 3180 

CGAACGATCG CGCCGCGAAC GGTTCAAAAA TCCTTTAGTG TCTTTATTAT CTTTTTGATG 3240 

AGCTTCTTGA TAGGATTGAT TCTGCTAGGG ATAACAGCCA AAGGCAATCC TCCCTTTATC 3300

CACCTCGTAT TTGAAACCAT TTCAGCTCTT AGTACAGTTG GTGTAACGGC AAATCTGACT 3360

CCTGACCTTG GGAAATTGGC TCTCAGTGTT ATCATGCCAC TTATCTTTAT GGGACGAATT 3420

GGTCCCTTGA CCTTGTTTGT TAGCTTGGCA GATTACCATC CAGAAAAGAA AGATATGATT 3480 

CACTATATGA AAGCAGATAT TAGTATTGGT TAAGAAAGGA AAGAGCATGT CAGATCGTAC 3540 

GATTGGAATT TTGGGCTTGG GAATTTTTGG GAGCAGTGTC CTAGCTGCCC TAGCCAAGCA 3600 

GGATATGAAT ATTATCGCTA TTGATGACCA CCCAGAGCGC ATCAATCAGT TTGAGCCAGT 3660 

TTTGGCGCGT GGAGTGATTG GTGACATCAC AGATGAAGAA TTATTGAGAT CAGCAGGGAT 3720 

TGATACCTGC GATACCCTTG TAGTCGCGAC AGGTGAAAAT CTGGAGTCGA GTCTGCTTGC 3780 

GGTTATGCAC TGTAAGAGTT TGGGGGTACC GACTGTTATT GCTAAGGTCA AAAGTCAGAC 3840

CGCTAAGAAA GTGCTAGAAA AGATTGGAGC TGACTCGGTT ATCTCGCCAG AGTATGAAAT 3900 

GGGGCAGTCT CTAGCACAGA CCATTCTTTT CCATAATAGT GTTGATGTCT TTCAGTTGGA 3960 

TAAAAATGTG TCTATCGTGG AGATGAAAAT TCCTCAGTCT TGGGCAGGTC AAAGTCTGAG 4020

TAAATTAGAC CTCCGTGGCA AATACAATCT GAATATTTTG GGTTTCCGAG AGCAGGAAAA 4080

TTCCCCATTG GATGTTGAAT TTGGACCAGA TGACCTCTTG AAAGCAGATA CCTATATTTT 4140

GGCAGTCATC AACAACCAGT ATTTGGATAC CCTAGTAGCA TTGAATTCGT AAAGAGGGAT 4200

GACCCCTCTT TTTTGATGCC TAAGATGGCA AATAGAGACA GAAGCCCCTT GTCTTCTAGT 4260

AAAAGTTCTT CAAAGGCTGG ACTTTATGGT AAAATAGAAA GAAGTGACAA GAGAGAGTAA 4320

TACTCAATGA AAATCAAAGA TCAAACTAGG AAACTAGCTA CGCGCTCCTC AAAACACTGT 4380

TTTGAGCTTG CAGATAGAAC TGACGAAGTC AGTAACATCT ATACGGCAAG GCGACGTTGA 4440

CGCGGTTTGA AGAGATTTTC GAAGAGTATA AGAAAAAATC AGTCCCCTAA AGGAGTAGAT 4500

TATGAAGTTA TTGTCTATCG CAATTTCTAG CTATAATGCA GCAGCCTATC TTCATTACTG 4560

TGTGGAGTCG CTAGTGATTG GTGGTGAGCA AGTTGGGATT TTGATTATCA ATGACGGGTC 4620

TCAGGATCAG ACTCAGGAAA TCGCTGAGTG TTTAGCTAGC AAGTATCCTA ATATCGTTAG 4680

AGCCATCTAT CAGGAAAATA AATGCCATGG CGGTGCGGTC AATCGTGGCT TGGTAGAGGC 4740

TTCTGGGCGC TATTTTAAAG TAGTTGACAG TGATGACTGG GTGGATCCTC GTGCCTACTT 4800

GAAAATTCTT GAAACCTTGC AGGAACTTGA GAGCAAAGGT CAAGAGGTGG ATGTCTTTGT 4860

GACCAATTTT GTCTATGAAA AGGAAGGGCA GTCTCGTAAG AAGAGTATGA CTTACGATTC 4920

AGTCTTGCCT GTTCGGCAGA TTTTTGGCTG GGACCAGGTC GGAAATTTCT CCAAAGGCCA 4980

GTATACCATG ATGCACTCGC TGATTTATCG GACAGATTTG TTGCGTGCTA GCCAGTTCTA 5040

ACTGCCTGAA CATACTTTTT ATGTCCATAA TCTCTTTGTC TTTACGCCCC TTCAGCAGGT 5100

CAAGACCATG TACTATCTGC CTGTCGATTT CTATCGTTAT TTGATTGGGC GTGAGGACCA 5160

GTCTGTCAAT GAGCAAGTGA TGATTAAGTG CATTGACCAG CAACTCAAGG TCAATCGACT 5220

CTTGATAGAC CAACTTGATT TGTCCCAAGT GACTCATCCC AAAATGCGAG AATATCTGCT 5280

GAATCATATT GAACTCACGA CGGTGATTTC CAGTACCCTG CTCAACCGAT CTGGAACAGC 5340


GGAGCATCTG 


GCAAAAAAAC 


GCCAATTGTG 


GACCTATATT 


CAGCAGAAAA 


ATCCAGAAGT 


5400 


CTTTCAGGCT 


ATTCGTAAGA 


CCATGTTGAG 


CCGTTTGACC 


AAACATTCTG 


TCTTGCCAGA 


5460 


TCGCAAACTG 


TCCAATGTCG 


TCTATCAAAT 


CACCAAATCT 


GTTTATGGAT 


TTAATTAATA 


5520 


TAAGTGTTTT 


ATAAGAGGGA 


TTTAAGAAAA 


ATTTTAACTT 


TTTCTTAGTC 


CTTTTTAATT 


5580 


TCAGGAGATT 


ATACTAGAGT 


CATCAAATAA 


AGAAAGACTC 


TAAGGAGAAT 


CCTATGAAAT 


5640 


TCAATCCAAA 


TCAAAGATAT 


ACTCGTTGGT 


CTATTCGCCG 


TCTCAGTGTC 


GGTGTTGCCT 


5700 


CAGTTGTTGT 


GGCTAGTGGC 


TTCTTTGTCC 


TAGTTGGTCA 


GCCAAGTTCT 


GTACGTGCCG 


5760 
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ATGGGCTCAA TCCAACCCCA GGTCAAGTCT TACCTGAAGA GACATCGGGA ACGAAAGAGG 5620 

GTCACTTATC AGAAAAACCA GGAGACACCG TTCTCACTCA AGCGAAACCT GAGGGCGTTA 5880 

CTGGAAATAC GAATTCACTT CCGACACCTA CAGAAAGAAC TGAAGTGAGC GAGGAAACAA 5940 

GCCCTTCTAG TCTGGATACA CTTTTTGAAA AAGATGAAGA AGCTCAAAAA AATCCAGAGC 6000 

TAACAGATGT CTTAAAAGAA ACTGTAGATA CAGCTGATGT GGATGGGACA CAAGCAAGTC 6060 

CAGCAGAAAC TACTCCTGAA CAAGTAAAAG GTGGAGTGAA AGAAAATACA AAAGACAGCA 6120 

TCGATGTTCC TGCTGCTTAT CTTGAAAAAG CTGAAGGGAA AGGTCCTTTC ACTGCCGGTG 6180 

TAAACCAAGT AATTCCTTAT GAACTATTCG CTGGTGATGG TATGTTAACT CGTCTATTAC 6240

TAAAAGCTTC GGATAATGCT CCTTGGTCTG ACAATGGTAC TGCTAAAAAT CCTGCTTTAC 6300 

CTCCTCTTGA AGGATTAACA AAAGGGAAAT ACTTCTATGA AGTAGACTTA AATGGCAATA 6360 

CTGTTGGTAA ACAAGGTCAA GCTTTAATTG ATCAACTTCG CGCTAATGGT ACTCAAACTT 6420

ATAAAGCTAC TGTTAAAGTT TACGGAAATA AAGACGGTAA AGCTGACTTG ACTAATCTAG 6480

TTGCTACTAA AAATGTAGAC ATCAACATCA ATGGATTAGT TGCTAAAGAA ACAGTTCAAA 6540 

AAGCCGTTGC AGACAACGTT AAAGACAGTA TCGATGTTCC AGCAGCCTAC CTAGAAAAAG 6600

CCAAGGGTGA AGGTCCATTC ACAGCAGGTG TCAACCATGT GATTCCATAC GAACTCTTCG 6660 

CAGGTGATGG CATGTTGACT CGTCTCTTGC TCAAGGCATC TGACAAGGCA CCATGGTCAC 6720 

ATAACGGCGA CGCTAAAAAC CCAGCCCTAT CTCCACTAGG CGAAAACGTG AAGACCAAAG 6780 

GTCAATACTT CTATCAAGTA GCCTTGGACG GAAATGTAGC TGGCAAAGAA AAACAAGCGC 6840 

TCATTGACCA GTTCCGAGCA AAyGGTACTC AAACTTACAG CGCTACAGTC AATGTCTATC 6900 

GTAACAAAGA CGGTAAACCA GACTTGGACA ACATCGTAGC AACTAAAAAA GTCACTATTA 6960 

ACATAAACGG TTTAATTTCT AAAGAAACAG TTCAAAAAGC CGTTGCAGAC AACGTTAAAG 7020 

ACAGTATCGA TGTTCCAGCA GCCTACCTAG AAAAAGCCAA GGGTGAACGT CCATTCACAG 7080 

CAGGTGTCAA CCATGTGATT CCATACGAAC TCTTCGCAGG TGATGGTATG TTGACTCGTC 7140 

TCTTGCTCAA GGCATCTGAC AAGGCACCAT GGTCAGATAA CGGTGACGCT AAAAACCCAG 7200

CCCTATCTCC ACTAGGTGAA AACGTGAAGA CCAAAGGTCA ATACTTCTAT CAATTAGCCT 7260 

TGGACGGAAA TGTAGCTGGC AAAGAAAAAC AAGCGCTCAT TGACCACTTC CGAGCAAACG 7320 

GTACTCAAAC TTACAGCGCT ACAGTCAATG TCTATGGTAA CAAAGACGGT AAACCAGACT 7380 

TGGACAACAT CGTAGCAACT AAAAAAGTCA CTATTAACAT AAACGGTTTA ATTTCTAAAG 7440 

AAACAGTTCA AAAAGCCGTT GCAGACAACG TTAAGGACAG TATCGATGTT CCAGCAGCCT 7500 

ACCTAGAAAA GGCCAAGGGT GAAGGTCCAT TCACAGCAGG TGTCAACCAT GTGATTCCAT 7560

ACGAACTCTT CGCAGGTGAT GGCATGTTGA CTCGTCTCTT GCTCAAGGCA TCTGACAAGG 7620

CACCATGGTC AGATAACGGC GACGCTAAAA ACCCAGCTCT ATCTCCACTA GGTGAAAACG 7680

TGAAGACCAA AGGTCAATAC TTCTATCAAG TAGCCTTGGA CGGAAATGTA GCTGGCAAAG 7740

AAAAACAAGC GCTCATTGAC CAGTTCCGAG CAAACGGTAC TCAAACTTAC AGCGCTACAG 7800

TCAATGTCTA TGGTAACAAA GACGGTAAAC CAGACTTGGA CAACATCGTA GCAACTAAAA 7860

AAGTCACTAT TAAGATAAAT GTTAAAGAAA CATCAGACAC AGCAAATGGT TCATTATCAC 7920

CTTCTAACTC TGGTTCTGGC GTGACTCCGA TGAATCACAA TCATGCTACA GGTACTACAG 7980

ATAGCATGCC TGCTGACACC ATGACAAGTT CTACCAACAC GATGGCAGGT GAAAACATGG 8040

CTGCTTCTGC TAACAAGATG TCTGATACGA TGATGTCAGA GGATAAAGCT ATGCTACCAA 8100

ATACTGGTGA GACTCAAACA TCAATGGCAA GTATTGGTTT CCTTGGGCTT GCGCTTGCAG 8160

GTTTACTCGG TGGTCTAGGT TTGAAAAACA AAAAAGAAGA AAACTAATCA GCTAAGGAAA 8220

TAAATGATGG ATAGTGGGCT GACTAAGATT AGTTTAACAA CTCAATCAGC AATCAGGACT 8280

TTCTTTCAAT AGCAGATTAA AATCATCGTA AAACAATAAA AATAGTGTTA TACTTAAAGC 8340

AGTATAGCAC TGTTTTTATC AAAGGAGAGA CAGATGGGAA AGACAATTTT ACTCGTTGAC 8400

GACGAGGTAG AAATCACAGA TATTCATCAG AGATACTTAA TTCAGGCAGG TTATCAGGTC 8460

TTGGTAGCCC ATGATGGACT GGAAGCGCTA GAGCTGTTCA AGAAAAAACC GATTGATTTG 8520

ATTATCACAG ATGTCATGAT GCCTCGGATG GATGGTTATG ATTTAATCAG TGACGTTCAA 8580

TACTTATCAC CAGAGCAGCC TTTCCTATTT ATTACTGCTA AGACCAGTGA ACAGGACAAG 8640

ATTTACGGCC TGAGCTTGGG AGCAGATGAT TTTATTGCTA AGCCTTTTAG CCCACGTGAG 8700

CTGGTTTTGC GTGTCCACAA TATTTTGCGC CGCCTTCATC GTGGGGGCGA AACAGAGCTG 8760

ATTTCCCTTG GCAATCTAAA AATGAATCAT AGTAGTCATG AAGTTCAAAT AGGAGAAGAA 8820

ATGCTGGATT TAACTGTTAA ATCATTTGAA TTGCTGTGGA TTTTAGCTAG TAATCCAGAG 8880

CGAGTTTTCT CCAAGACAGA CCTCTATGAA AAGATCTGGA AAGAAGACTA CGTGGATGAC 8940

ACCAATACCT TGAATGTGCA TATCCATGCT CTTCGACAGG AGCTGGCAAA ATATAGTAGT 9000

GACCAAACTC CCACTATTAA GACAGTTTGG GGGTTGGGAT ATAAGATAGA GAAACCGAGA 9060

GGACAAACAT GAAACTAAAA AGTTATATTT TGGTTGGATA TATTATTTCA ACCCTCTTAA 9120

CCATTTTGGT TGTTTTTTGG GCTGTTCAAA AAATGCTGAT TGCGAAAGGC GAGATTTACT 9180

TTTTGCTTGG GATGACCATC GTTGCCAGCC TTGTCGGTGC TGGGATTAGT CTCTTTCTCC 9240

TATTGCCAGT CTTTACGTCG TTGGGCAAAC TCAAGGAGCA TGCCAAGCGG GTAGCGGCCA 9300

AGGATTTTCC TTCAAATTTG GAGGTTCAAG GTCCTGTAGA AfFPSASCAA TTAGGGCAAA 9360

CTTTTAATGA GATGTCCCAT GATTTGCAGG TAAGCTTTGA TTCCTTGGAA GAAAGCGAAC 9420 

GAGAAAAGGG CTTGATGATT GCCCAGTTGT CGCATGATAT TAAGACTCCT ATCACTTCGA 9480 

TCCAAGCGAC GGTAGAAGGG ATTTTGGATG GGATTATCAA GGAGTCGGAG CAAGCTCATT 9540 

ATCTAGCAAC CATTGGACGC CAGACGGAGA GGCTCAATAA ACTGGTTGAG GACTTGAATT 9600 

TTTTGACCCT AAACACAGCT AGAAATCAGG TGGAAACTAC CAGTAAAGAC AGTATTTTTC 9660 

TGGACAAGCT CTTAATTGAG TGCATGAGTG AATTTCAGTT TTTGATTGAG CAGGAGAGAA 9720 

GAGATGTCCA CTTGCAGGTA ATCCCAGAGT CTGCCCGGAT TGAGGGAGAT TATGCTAAGC 9780 

TTTCTCGTAT CTTGGTGAAT CTGGTCGATA ACGCTTTTAA ATATTCTGCT CCAGGAACCA 9840 

AGCTGGAAGT GGTGGCTAAG CTGGAGAAGG ACCAGCTTTC AATCAGTGTG ACCGATGAAG 9900 

GGCAGGGTAT TGCCCCAGAG GATTTGGAAA ATATTTTCAA ACGCCTTTAT CGTGTCGAAA 9960 

CTTCGCGTAA CATGAAGACA GGTGGTCATG GATTAGGACT TGCGATTGCG CGTGAATTGG 10020 

CCCATCAATT GGGTGGGGAA ATCACAGTCA GCAGCCAGTA CGGTCTAGGA AGTACCTTTA 10080 

CCCTCGTTCT CAACCTCTCT GGTAGTGAAA ATAAAGCCTA AAACCCCTTT ACAAATCCAG 10140 

CTATTCATGG TAGAATAGAT TTTGTGTGAA ATATCAGCAG GAAAGCATGA AGCTCGTCAA 10200 

CAGGTGTCTT ATGACAAGTA ACCTTGGCTG TTTAGGCGAA GGGCATCTGC ACGG 10254 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9769 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CCGGCGACTA TCGATAACAC TTGACTTGGT AGCCCCACAT TTTGGACAAC GCATCCTTTC 60 

CCTCCTTATC GTTTTCTTTT CATTATACCA TTTTTTAAGC GATTCCCAAA ACAATTCTTC 120 

TTTTTGCTTG ACAAGTTTTT TGTTTTGTTG TATTATTTAA TTAAGACAAC AAGGTAAAAG 180 

AAAGGAGACT AAGATGTCCT GGACATTTGA CAACAAAAAA CCCATCTATT TACAGATTAT 240 

GGAGAAAATC AAGCTTCAGA TTGTTTCCCA TACACTGGAA CCCAATCAAC AACTTCCAAC 300 

CGTGAGGAGC TAGCTAGCGA GGCTGGTGTC AATCCCAATA CCATCCAAAG AGCCTTATCA 360

GACCTTGAAC GAGAAGGATT TGTCTACAGC AAGCGAACAA CTGGACGATT TGTGACTAAG 420 

GATAAGGAGC TAATCGCCCA GTCACGCAAA CAATTATCAG AAGAAGAATT GGAACACTTC 480

GTTTCCTCCA TGACCCATTT TGGCTATGAA AAAGAAGAAC TACCAGGCG? AGTCAGTGAT 540

TATATTAAAG GAGTTTAAGC CTATGTCATT ACTAGTATTT GAAAATGTAT CCAAATCATA 600

TGGAGCAACA CCAGCCCTTG AAAATGTTTC TCTTGACATT CCAGCTGGAA AAATTGTCGG 660

CCTTCTTGGG CCAAACGGCT CAGGAAAAAC AACCCTGATT AAACTAATTA ATGGCCTCTT 720

ACAACCAGAT CAAGGACGTG TCCTCATCAA CGACATGGAC CCAAGCCCAG CAACCAAGGC 780

CGTTGTAGCT TATTTGCCTG ATACGACCTA TCTCAATGAG CAAATGAAGG TCAAAGAAGC 840

CCTAACCTAC TTCAAGACCT TCTATAAAGA TTGTCAGATC TTGAACGCGC CCATCATCTA 900

CTTGCAGACC TGGGCATTGA TGAAAATAGT CGTCTCAAGA AACTATCAAA AGGAAACAAA 960

GAAAAGGTTC AACTGATTTT GGTTATGAGC CGTGATGCTC GTCTCTATGT TTTGGACGAA 1020

CCCATTGGTG GGGTGGATCC AGCAGCCCGT GCTTATATCC TCAATACCAT TATCAACAAC 1080

TACTCACCAA CTTCTACCGT TTTGATTTCT ACCCACTTGA TTTCTGATAT CGAGCCAATC 1140

TTGGATGAAA TTGTCTTCCT AAAAGACGGA AAAGTCGTCC GTCAAGGAAA TGTAGATGAT 1200

ATTCGCTACG AGTCAGGTGA ATCCATTGAC CAACTCTTCC GTCAGaATTT AAGGCCTAAG 1260

CAAAGGAGAT TATTTATGTT TTGGAATTTA GTTCGCTACG AATTTAAAAA TGTTAACAAG 1320

TGGTATTTAG CCCTCTACGC AGCCGTGCTA GTCCTTTCTG CCC7CATCCG AATACAGACA 1380

CAAGGCTTTA AAAATCTACC TTACCAAGAA AGTCAGGCTA CTATGCTACT TTTTCTAGCT 1440

ACAGTCTTTG GTGGCTTGAT GCTTACACTT GGGATTTCAA CCATTTTCTT GATTATTAAA 1500

CGCTTCAAAG GTAGTGTCTA CGACCGACAA GGCTATCTGA CTTTGACCTT CCCACTTTCT 1560

GAACACCATA TCATCACAGC CAAACTAATC GCTGCCTTTA TCTGGTCATT GATTAGCACC 1620

GCTGTATTGG CTCTAAGTGC TCTTATTATT CnCCTTTAA CACCTCCACA ATGCATTCCT 1680

CTTTCTTATG TGATTACATT TGTAGAAACA CATCTCCCTC AGATCTTTCT TACAGGTATA 1740

TCCTTCCTAC TAAATACTAT TTCAGGAATC CTCTGCATCT ACCTGGCTAT TTCCATTGGA 1800

CAGCTTTTCA ATGAATACCG TACAGCACTC GCTGTTGCAG TCTACATTGG TATCCAAATC 1860

GTCATTGGAT TTATTGAACT TTTCTTCAAT CTTAGTTCTA ATTTCTATGT CAATTCACTG 1920

GTAGGACTCA ATGACCATTT CTATATGGGA GCAGGTATAG CCATTGTTGA AGAACTCATA 1980

TTCATAGCTA TCTTTTATCT CGGAACCTAC TACATCTTGA GAAATAAGGT TAATTTGCTT 2040

TAAATAATTT TTACCTAGAT ATGTAACATA CTCATAGAAC AAAAGAGACC AGGCAAAAAG 2100

TCTTTAAAAT TAGAAAACGC ATAGTATCAG GTGTTGAATA TGTACTGCcC CCCAAAAGTT 2160

AGATTTTTTC TGTCTAACTT TTGGGGGCAG TTCATAAGAA CCTTGGTAAT ATGCGTTTTT 2220

TGTGAGCTGA CTTATTTCCT TTCACTATAT CGCAAAATGA AATAAGAACG GAACGATGGG 2280


ATTTTGGAAT TCAAATCAAT TTATAAGAAT GTTTTAGAAG TAATATTATC CTATTCCAGA 2340

TTCAGTTCAC TATACAATTG AGTTTTCAAG CAACCTGTTT ACATAATGTG TACATAATTA 2400




GGTTCGTGAT 


TCCACCCTTT 


TCACCTTTAA 


AAACCTCGCT 


TTCGCAAGGC 


A w 4 l V 1 n 1X1 


?4 fin 


ATAAGATAAG 


GCACGTTTAA 


AGGTTTTCCA 


AATCCCTAAA 


TCATCCCTTT 

4- ^™ * ■ A Smr 4 4 4 


GAAGAAGGAG 


icon 


ACTAGCATAC 


ATGCGTCCGA 


TAAATCCTGT 


TGCTACCACC 


GCAAAAATCA 




£ JoU 


AAGTGAAATC 


CATGCTTCTG 


CTCCCCCCGC 


ATAGTCATTA 

• * » • * m ^m •< « 4 4 


ATCGTTCGAA 


ACGGCATAAA 


* o ** u 


GAAGGTCGAA 


ATAAAGGGAA 


TATAAGAACC 


AATCTTCAAG 

■ mm m mi ^m m. m # *pi 


AGGAGATTGT 


CACCAGPTGC 


l IUU 


ACCTAGAGCT 


GTCACTCCAA 


AAAAACCACC 


CATAATCAAA 








TTTCCCTGAG 


TCCTCAGGAC 


GAGAAACCAT 

^P^ mm mw m^mt **m-m m 4- 


AGATCCTAGG 


AAGGf^TGr'PA 


AG A P*T A PPT A 


Za £U 


CATGAAAAGA 


CTGATCAAAA 


TAAAGAGCAA 


GGTATTCACT 


GAGATAGPAT 
unun i nOLrt i 






ATCCAAAATA 


CCAGACTGAG 


CCAAGAATGG 

^» 4%^>VPt 4 ^mJ^& 




AACAnfA A A A 


V. VJvj>_/\vaV. V. All 




ACCACCTACA 


ACATAGATCC 


CAATATGCGT 


TAAAATCACT 


AGAAAPAGAG 


r*r* a t*p a Trr p 


JUUU 


CGCATAGAAA 


TAGTGACTTG 


CCCTTATGCT 


AGAAAAAAfC 


A (T*TY* P A *T* A A 


III iuuTuLL 


JUoU 


TTTTTCACTG 


GCAACTTCCT 


GAGCTGTTAC 


ACCCGCATAG 


GTAATGAPAA 


TP AT ATA h. IP 
1 1- A 1 A 1 AAAv» 


J 1 £ 0 


AAAGAATCCT 


AAGGCACCTG 


CTGCAATTCT 


TTGAATAAAC 


till tnl I l 1 


U 1 1 vAtv. 1 IL 


lion 
J 1 HU 


ATCAATCTTT 


TCTGTGAATT 


GAATTGTCTG 


CGCTAAGCCT 


1111 1 O V. 1 


PTTGAGAPA A 




GGAAGCAGTT 


GAACGATTAA 


GCTGATTTTG 


CAGTTCATTG 


AGTPJTAf*f"TY" 




J J UU 


TTTAATTCCA 


TTTTCAAGCG 


ATGTTTCGCC 


ATGATAAACT 


GGCT' T *TAGAA 

1*1 nunA 


PAGTATPTTP 




TTGATCAATG 


GTCAAATAAC 


CTTTTAATTT 


TTCTTCTTTA 


r\ 1 1 \— 1 1 ^ 1 


TGGGAGTTGG 




TTCGTCTTTA 


TAGTCGAAGT 


TAACACCATT 


TACAT* r '*TTC 




PTAPARATr^: 


i ji on 


CACTGTTGTC 


ACTACTGCCA 


CTTTATTATT 


TTTAGCCATA 


GAAGAACCTT 


GGAGATGPPP 




AATTCCTACA GAGATTCCTA AAAAGAGGAA CGGCGAAATC ACCATAAAGA AGAAACTCCA 3600

TGACTCGACA TGTCGAAGAT AGGTTTCCTT GATTACAACC CACATATTTC TCATACTTCC 3660

ACTCCTGATT CTAGTTTAAA CATTTCATCG ATAGTTGGCG CTTGTTGGTC AAATGTTGCG 3720

ATATATTGAC CTTGAGTCAA GATTGAGAAG AGTTCCCTTC CAGCGCTCTC ATCCTCCAAA 3780

ATCAATTTCC AACTGCCTTG TTTGGTCAAG CTCACCTGTT TGACATGAGG AAGATTTTCC 3840

AATTCTTCCT TGCTTCGTTC ACTTGAAACA AAGAGACGCG TTTTCCCGTA TTGATTGCGG 3900

ACATCCTGAA CTGGTCCGTG CAAGACCACA CGGCCATCTC GGATCATCAG AATATCGTCA 3960

CAAAGTTCCT CAACATTGGT CATGACATGG TCAGAAAAGA TAATGGTTGT CCGCGCTCTT 4020

TTTCCTGAAA AATGACTTGT TTGAGCAATT CTGTATTAAC TGGGTCCAAT CCACTAAAAG 4080

GCTCATCCAA GATAATCAGG TCTGGTTCAT GAATCAGAGT AATAATGAGC TGAATCTTCT 4140 

GCTGATTTCC TTTTGACAGA CTCTTGATTT TATCTGTCAG CTTTCCTTTC ACTTCCAACC 4200 

TCTTCATCCA TTGAGGGAGT TTTTCTTTGA CTTCTTTGGC ATCCATGCCT TTTAGAGTCG 4260 

CCAAGTAGCG AACTTGTTCA AGAACTGTCA ATTTAGGCAT GAGATGCGTT CTTCAGGCAG 4320 

ATAACCAATC CGAGCATAGG TCTCCTGACG AATATCCTGA CCATCCAGAC CGATTTCTCC 4380

CTGATATTCT AGGAATTTCA AAATACTATG GAAAATCGTT GTTTTTCCAG CACCATTTTT 4440 

TCCGACTAGT CCCAAAATAC GACCTGGTCG CGCTTGAAAG TCAATACCAA ACAAAACTTG 4500 

CTTGGATCCA AAACTTTTCT CTAGACTTCT TACTTCTAGC ATCTTTCACC TCCGAAATTT 4560 

CTTGCACTCA TTATACTCCT TTTTGATAGC CTTTACAATG TTTTTTGTCC ATTTTTAGAA 4620 

GACTATTGCT GTGTAAAATA TGGCCTGGAG CACTTTTATA CTCAATGAAA ATCAAAGAGC 4680 

AAACTAGGAA GCTAGCCGTA GACTGCTCAA AGTACAGCTT TGAGGTTGCA GATAAAACTG 4740

ACGAAGTCgA CTCAAAACAC TGTTTTGAGG TTCTGGATAG AACTGACGAA kCrTAaCTAT 4800 

ATCTACGGCA AGGCGAAcTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT TAGTGATAAA 4860 

TCCATTATAC AGCAGCAAAC TTAATTTATA CCTTCCGCTC CTCAACTGTC TATTTTTAAT 4920 

CCTGAATTGT TATTTGAGTA ACTCCTTTTT CCTCGTAAAG TTTTCTTCCT CTAAAACTTC 4980 

TGGAAAAAGG CTAATAGTTT CAGACAACAT TTTTATAAGA AACAAGTTCA TCTGTCATTT 5040 

CAAGAAGGAG TAATCCTTTA TCTACTAATG GACGGAACAG AATTCAACCG CTTGTCCGAT 5100 

ATGTTTTCTA AGGATTATAT AGTAAAATGA AATAAGAACA GGACAAATTG ATCAGGACAG 5160 

TCAAATTGAT TTCTAACAAT CTTTTAGAAG TAGATGTATA CTATTCTAGT TTCAATCTGC 5220 

TATATCTATT ATGCACACCC CTATAGGATC TAATGAAAAT CACAACAGGC TCATTCATAG 5280 

ATGCTTACCT AAGCCTAAGG GAACTAAGAA AACGACTACC AAGGAAGTCG CATTCATCGA 5340 

AAAGTAGATT AACAACTATC CTAAAAAATG CTTGAACTAC AAGTCCCCCA GAGAAGACTT 5400 

CTGGATGACT AACTTGAACT TGAAATTTAG CAATAATTAA TTCACTATCT AACTATATTT 5460 

AGTAATTATT TCAGAACTGA TTAATATTAA AATTAACTAA CAATTCAAAG GATTCATACT 5520 

AGCCATAAAT TACGTCCATC AGAGAGAGAC TCTTACTACT TTTAGATTTT AGTCTTTCTA 5580 

GCTTCAGAAT ACATCTAAAC TTTAGGGAAA ATGACTATTC GAAAGCGCGA ATGCCTCAAA 5640 

ATTATCTCAG ATAAGCTATT CGAAACTTAG AATCCTTTTA AATTTATGGA ATTGCGATTA 5700 

TTCGAAACCT AGAATGCATA TAACCTTTAG TTGACAGACC TATTCTAAGT CTCGAAGGGC 5760

TATTTACTTT CTATTCCTTA TCAAAAAAGA CTCATTCCCC CTTTCTCCTC CAAAATATGG 5820

TATAGTAGAA ATATACTATC TATGAGGAGT TTACATGTCA CAGGATAAAC AAATGAAAGC 5880 

TGTTTCTCCC CTTCTGCAGC GAGTTATCAA TATCTCATCG ATTGTCGGTG GGGTTGGGAG 5940 

TTTGATTTTC TGTATTTGGG CTTATCAGGC TGGGATTTTA CAATCCAAGG AAACCCTCTC 6000 

TGCCTTTATC CAGCAGGCAG GCATCTGGGG TCCACCTCTC TTTATCTTTT TACAGATTTT 6060 

ACAGACTGTC GTCCCTATCA TTCCAGGGGC CTTGACCTCG GTGGCTGGGG TCTTTATCTA 6120 

CGGGCACATC ATCGGGACTA TCTACAACTA TATCGGCATC GTGATTGGCT GTGCCATTAT 6180 

CTTTTATCTA GTGCGCCTAT ACGGAGCTGC CTTTGTCCAG TCTGTCGTCA GCAAGCGCAC 6240 

CTACGACAAG TACATCGACT GGCTAGATAA GGGCAATCGT TTTGACCGCT TCTTTATTTT 6300 

TATGATGATT TGGCCCATTA GCCCAGCTGA CTTTCTCTGT ATGCTGGCTG CCCTGACCAA 6360

GATGAGCTTC AAGCGCTACA TGACCATCAT CATTCTGACC AAACCCTTTA CCCTCGTGGT 6420 

TTATACCTAC GGTCTGACCT ATATTATTGA CTTTTTCTGG CAAATGCTTT GACACGTAAA 6480 

AAATCCGTTT GGTTTCCCAA GTGGATTTTT AAAGCGTAGA TTAACTATAG CTTGATACTA 6540 

AATATACTTT GGTATGGAAA TCATGCATAT TTTTCGATAG TGACGCCAGC ACTTACCTAG 6600 

CCTTTCCGCC GTGATAGAAA CACCTGAAAT CTAATGGTTT CAGGTATTCG GAAACTTTGA 6660 

GCCTAGTGTC TCAAAGTTTA GGTATGGAAT TTTGAAGAAA GTCGCTACCG TCCGTAATCA 6720 

CTTAAGGAAA GGCTCAAAAA TATTGTTTTC AACCACAAAA TCCGTTTGGT TTCCCAAGCG 6780 

GATTTTGTGC TTTATTTTGA AACTTCTTTT GCAAGAACAA AGTTCCCAAG TGTGGCAGAA 6840 

CCATTTCCTG CGACTGCTGG CGTCACGATA TAGTCACGCA CATCTGGTAC TGGTAGGTAA 6900 

CCATTAAGAA GAGATGTAAA TTTCTCACGG ACACGGTCCA GCATATGTTG TTGAGCCATG 6960 

ACCCCTCCAC CAAAGACAAT CACGTCTGGG CGGAAAGTCA CTGTCGCATT AACCGCAGCT 7020 

TGAGCGATAT AGTAGGCTTG AACATCCCAA ACAGGGTTGT TGAGTTCAAT AGTTTCCCCA 7080 

CGTACACCTG TACGAGCTTC CAAACTTGGA CCAGCTGCAT AACCTTCTAG ACATCCCTTA 7140

TGGAAAGGAC AAACACCCTT AAACTCTTTT TCAATATCCA TTGGGTGTCT AGCAACATAA 7200 

TAATGACCCA TTTCAGGGTG ACCCACACCA CCGATAAACT CACCACGTTG GATGACGCCT 7260 

GCACCGATAC CTGTACCGAT TGTGTAGTAA ACCAAGTTTT CGATACGACC ACCAGCATTG 7320 

TTACGGGCAA CCATTTCACC GTAAGCAGAG CTGTTTACGT CTGTTGTGAA CTACATTGGC 7380 

ACGTTTAGGG CGCGACGAAG GGCACCAACC AAGTCTACAT TTGCCCAGTT TGGTTTTGGA 7440 

CTCGTCGTGA TAAAGCCATA AGTTTTTGAG TTTTTGTCAA TATCAATCGG CCCAAATGAA 7500 

CCAACTGCAA GACCAGCAAG GTTATCCAAT TTTGAGAAGA ACTCAATGGT TTTATCGATT 7560

GTTTCGATTG GAGTTGTTGT TGGAAATTGT GTTTTTTCTA CAACGTTAAA GTTTTCATCA 7620

CCGACAGCAC AGACAAACTT TGTACCGCCC GCTTCCAAGC TTCCATATAA TTTTGTCATG 7680

ATAAACCTCT TGTTTTTATT TTCTTTATTA TAGCATACTT CGAAAGTCTA AATGTCTCTA 7740 

TTTTTTAGAT TTTCCTCTGT AAATCTTACT ATCTAATAAA AACGAACAAA CATGTCATTT 7800 

GTTCGTTTTC ACATTAGAGA GGATTGATTA GATTTTCACT TCGATCACAG CATCCCCCTT 7860 

AGCAACTGAA CCTGTTGCGA CTGGAGCTAC TGAAGCGTAG TCACCTGTAT TTGTAACGAT 7920 

AACCATTGTT GTATCATCAA GTCCAGCTGC AGCGATTTTG TTTGAGTCAA ATGTTCCAAG 7980 

AACATCGCCA CCTTTCACCT TATTACCTTG AGCAACTTTT GTTTCAAAAC CGTCACCGTT 8040 

CATAGATACA GTATCAATAC CAACATGAAT CAAAACTTCA GCACCATTTC TTGTTTTCAA 8100 

ACCAAAAGCG TGCCCTGTTG GAAAGGCAAT TGAAACTTCA GCATCAGCTG GTGCATAGAC 8160 

CACGCCTTGG CTTCGTTTCA CAACGATACC TTGTCCCATA GCTCCACTTG AGAAGACTGG 8220 

GTCATTGACA TCAGCAAGAG CGACAACATC ACCGACGATA GGAGTTACAA GTGTTTCATT 8280 

TTGAAGACCT GCTGGCGCAA CTTCTTCTTT TTCTTCAGCC ACTTCAGCTC GTTTTGCAGC 8340 

TGCAGTTGCG TCTACTTCAT CTTCGTAACC AAACATGTAA GTAAGAGCAA AACCAAGGGC 8400 

AAATGATACA GCTACCATAA GAAGGTATTG TGGAAGTTGT CCGTTACCAA CATAAAGCAT 8460

TGTACCAGGG ATGATGGTGA TACCATTACC AGTACCAGCA AGTCCAAGGA TAGAAGCCAA 8520 

TCCACCACCG ATTGCACCAG CAATCAATGA AAGGAAGAAT CGTTTACGGA AGCCCAAGTT 8580 

CACCCCGAAG ATAGCAGGCT CTCTAATACC 7AGGAAGGCA GAAAGAGCAG CCGGGAAAGC 8640 

AAGTGTTTTC AGTTTTGGAT TTTTTGTTTT AACACCAACC GCAACAGTAG CAGCACCTTG 8700 

AGCTGTCATA GCAGCTGTGA TGATAGCGTT GAATGGGTTA GCATGGTCAG CAGCAAGTAA 8760 

TTGCACTTCA AGCAAGTTGA AGATGTGGTG CACACCTGAC ACGACGATCA ATTGGTGAAC 8820 

CCCACCAATC AAGAAACCAC CAAGACCAAA TGGCATGCTA AGAATCGCTT TTGTAGCAAT 8880 

AAGGATGTAG TTTTCAACAA CGTGGAAAAC TGGTCCAATG ACAAAGAGTC CAAGGATAGA 8940 

CATGACCAAA AGTGTCACGA ATGGTGTTAC CAAGAGGTCA ATGACATCTG CAACAACTTG 9000 

CGGACAGCTT TTTCAAATTT AGCTCCGACA ACCCCGATGA TGAAGGCTGG AAGAACGGAA 9060 

CCTTGCAAAC CAACAACAGG GATGAAACCA AAGAAGTTCA TCCCTGTTAC TTCACCACCT 9120 

TGAGCAACTG CCCAAGCGTT TGGAAGTGAG CCAGAGACAA GCATCATACC AAGAACGATA 9180 

CCAACGGCAG GATTTCCACC AAATACACGG AAGGTTGACC ACACAACCAA ACCTGGCAAG 9240 

ATGATGAAGG CTGTATCTGT CAAGATTTGT GTGTAAGTTG CAAAGTCACC TGGAAGTGGC 9300

ATTTCAACAG CGTTGAAAAG ACCACGCACA CCCATGAAGA GACCTGTCGC TACGATAACT 9360

GGGATGATTG GAACGAAAAC ATCACCAAAA GTACGGATAG CACGTTGGAA CCAGTTCCCT 9420 

TGTTTAGCAA CTTCTGCTTT CATGTCATCC TTAGATGATG TTGGTAATCC AAGTACAACA 9480 

ACTTCATCGT ACATTTTGTT AACTGTACCT GTACCAAAGA TAATTTGGTA TTGCCCTGAG 9540 

TTAAAGAAAG CACCTTGAAC TTTTTCCAAG TTCTCAATCA CTTCTTTATT GATTTTCTCT 9600 

TCATCTTTGA CCATGACACG TAGACGAGTC GCACAGTGGG CAACACTATT GACATTTTCA 9660 

CGTCCGCCCA AGGCATCGAT GACTTTTTTT GCAATTTCCT GATTGTTCAT TTGCAAAAAT 9720 

CTCCTTATAT AACATTTTGT TCTTGTTTGA AAGCGATTTT ATTCGCCGG 9769 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3149 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CGCTTGAGTG CTAATTCATA GTTCTATTGT ATCACTTGGT CAGAAATAAT CAAGAAAAAA 60

GTCTGACTTT CTCAAGATAA AAAGCCTGAG ACCAACTCAG ACTTTTTAAT TCTTAAAATG 120

GCAATTCTTC CTCTTCCAAG ACCAAATCTG CCAAATCTTG GCCTGCATTA TTTTCACGCA 180

TAGCACGTTG GGCACGACTT TCCAAGAGTT GGAATCCTGT GACAAGTACT TCGGTCACGT 240

AGTTCATTTG GCCATTTTTC TCAAAGCGAC GGCTACGCAA TTCTCCATCA ACGGAAATGA 300

GACTACCTTT GGTTGCGTAC TTGCCAAAGT TTCTGCTAGT CTGCCCCATA GGACCATATT 360

GACAAAATCA GCTTCACGTT CACCGTTTTG GTCTTTGTAA CGACGGTTCA CACCGATACT 420

TGCTCGCGCT ACCGACTTGT CATTGTTGGT TTTGTGCAAT TCTGGTGTAG ACGTTAAACG 480

TCCAATCAAG ATAACTTTAT TATACATATT TTCTTCCTCC TACTTATCTA TTCGTAGG/A 540

ATCAAAAAAA GTTACAGAAA TTTGTAACTT TTCGAGAAAA TTTTTTATTT TTTATGAACC 600

ATCAAACCTG TCGCCTGTTG ATTGGCCATA ATGGTCATAT CTGTAATCTG AACACGACGA 660

GGTTGACTAG TCACATAGAC TACTGTATCT GCAATATCCT GAGCTTGCAA AGCTTCTATT 720

CCTTGGTAAA CGGACGCAGC TCGTTCTTTA TCACCATGAA AACGCACTGT AGAAAAATCT 780

GTTTCGACAA TTCCAGGCTG AATGGTCGTC ACCTTGATAT CCGTTGCGAT GGTATCAATT 840

CGCAGTCCAT CTGAAAAGGT CTTAACTGCC GCCTTGGTGG CTGAGTAAAC AGCTGCACCA 900

GCATAGGCAT AAATTCCTGC GGTTGACCCC ATATTGATAA TATCACCTTG ATTGGCTTTT 960




ACCATTGCTG GCAAGAAACA GCGAGTGACT GCCATCAAAC CTTTGACATT GGTATCCAAC 1020 

ATGGTCAGCA TATCCAACTC TTCATAGTCT TGATAGGGAG CTAAGCCAAG AGCCAGTCCT 1080

GCGTTATTGA CCAGGATGTC AATCTGACCT ATCGTTTCTA AAATATCAGA GCAGACAGTC 1140

TTTACCATTG TCATATCCGT GACATCTAGG AGAAAAGTCC AAACTGTTTG ATTTGGAAAA 1200 

GTTTCTGCAA ACTCCGCCTT AAGAGCTTCT AGTCTGTCTA TCCGTCGTCC TGTTAGAACG 1260 

ACATCCTCAC CCTGCTCCAG ATAAGCACGC GCAATCGCTT CACCGATTCC TGATGTCGCT 1320 

CCTGTAATCA CAACATTTTT TGCCATCTTA TTTCCTTCTA GCTGGTCTAT CAGATATTAA 1380 

CAACTTCTTA GGCAGTCCAG TGTTTCGCTG GGTCGAACGG TGTTCCGACA ACTTCGTCTT 1440 

CTGATAATTC AAGCACCCCA CGTTTTTGTG GAGCATTTGG CAGATGCAAT TCACGAGGAC 1500 

TGCACATCAT ACCAAAACTC TTTTCACCAC GAAGTTCACC TGGGAAAATG AGATTCCCTT 1560 

TTGGCATCAT AGCTCCAGGA AGCGCGACAA TGGTTTTCAA CCCCACACGC GCATTGGGAG 1620 

CTCCTGCAAC GATTTGTACA GTCTTATCAC TTGCGACTGC AACTTGGCAG ATGTTGAGGT 1680 

GGTCACTATC TGGATGGGCT ACCATCTCAA CAATTTCACC TACAACAAAC TTAGGTTCCT 1740 

TATCATTAAC AATTTCTTCT GTAAAACCTT CCGCCTGCAA CTCTTGGTTC AAACGAGCGA 1800 

CTTGCTCATC TGTCAAAAAG ACTTGACCGC GCTCTGCAAT TTCAAATAAA CTTGAAACTT 1860 

CGAAAATATT CCAAGCCACT GTTTCCCCAT TATCTTTGAG AAAAACACGG GCTACCTTGC 1920 

CTTTGCGCTC CACATCCAGT TTGGCATCTC CGCTATTTTT CACGATGACC ATAAGGACAT 1980 

CACCGACATG TTCTTTATTA TATGTAAAAA TCATTGTTTC CTTTTTCTCC TATTTCAGTC 2040

CTGCTAAAAA GTCATTGATT TGTTGCTTGC TTTTACGGTC GCGATTGACA AAACCACCGA 2100 

TTTCCTTGTC CTTTTCTAGA ACAACAAGGC TAGGAATTCC GTAAACATCC CAGAGTTTGG 2160 

CCAAATCCAT ATACTGATCT CGGTCCATTC GAATAAAGGT GAACTCTGGA TTGGTCTCCT 2220 

CAATCTCTGG TAAGGCAGGA TAAATATAAC GACAATCGCT ACACCAGTCT GCCACAAAAA 2280 

TGAAGACCTT CTTGCCCGCT TTTTCCACTA AAGATGCTAA TTCTTCTAAA CTTGCTGGCT 2340

GTATCATAAG ACTTCCTCCT CATAGACTAG GTCTTCATTT TCATAGACAA AGGTATAATG 2400 

ACGGCCATCC TCAAAAATGA CGCCACCAAC CAAGCTCTCC AGACTGCTTT CGTAAACTTG 2460 

AACATAAAGG GTCGCAATTT CCCCCATGTC GGAAAAATGG TCTCGCACAA TCTCTGTCAA 2520 

CTCTTCCTGA GTCTTCATGA GCTTACGGTC ATCTGCAACT TTTTTCGTAG CAAGAGCAAG 2580 

GCTTCCGATA CCTAGCAGAG CCAAGCCTGC CATCCACATT TTTTTAGCTT TCATACCATT 2640

CATTTTAACA CAAAAAAGGC TTCAGGACAA ATGAGGAAGC AGCAGAAAAG CAAGTAAAAA 2700 




GCCTCTTCCT TTAAGGAAAA GGACTTCTTA TACTCAATGA AAATCAAAGA CCAAACTAGG 2760

AAGCTAGCCG CAGGCTGCTC AAAGCACTGC TTTGAGGTTG TAGATAGAAC TGACGAgTCa 2820

CTCAAAACAC TGTTTTGAGG TTGTGGATGA AGCTGACGTG CTTTGAAGAG ATTTTCGAAG 2880

AGTATTATTC TTATTGCCAG GCACCTAAGT TGCCAACGTA GTAACTATCA GGTGTGTAGG 2940

TATTGCGAGC ATCTTACCTG ATGAAGCCAG ATAATACTAC TTGCCATTGT CTTTGACCCA 3000

ATCATTCGCA ATCATGGAAC CAGAAGAACT TACATAATAC CATTCTCCCT TGTCATAAAC 3060

CCAAGTACTG ACTTTCATGG TTCCTGAGCA ATTAAAGGCA AAAAAACTGT CCAATAACAT 3120

TCGTTTTTTA AAAGCATTTG ACACTACAT 3149

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CCAAAAATTC AACCTTTAAG GGGAGTCCAG AGAGACTCAC AACCTCTCAG ATAAAAGAAT 60

GGTGCAATTT TCTAGAGGAG ACTTTTTGAG TGTGCTCTCT TGTGTTGTAC GATTTTAACT 120

GAGGCCTTGC ACTAGCAAGG TCTTTTCTTT ATCTGGTCCC CTTAAAATTT AAGGAGGAAA 180

AGTTATGAAT CCCACATGTA AGAAGCGTTT GGGTGTCATT CGGTTGGAAA CCATGAAGGT 240

CGTTGCACAA GAGGAAATCG CGCCACAATC TTTGAATTAG TCCTAGAAGG AGAAATGGTT 300

GAAGCCATGC GAGCAGGCCA ATTTCTTCAT CTGCGTGTAC CGGACGATGC CCATCTCTTA 360

CGTCGTCCTA TTTCAATTTC GTCTATTGAC AAGGCAAACA AGCACTGTCA CCTCATTTAT 420

CGGATTGACG GAGCTGGGAC TGCAATTTTT TCAACCTTAA GTCAGGGAGA CACTCTTCAT 480

GTGATGGGGC CTCAGGGAAA TGGTTTTGAC TTGTCTGACC TTGATGAGCA GAATCAGGTT 540

CTCCTTGTTG GTGGTGGGAT TGGTGTTCCA CCCTTGCTTG AGGTGGCCAA GGAATTGCAT 600

GAACGTGGAG TGAAAGTAGT GACAGTCCTC GGTTTTGCTA ATAAGGATGC TGTTATTTTG 660

AAAACGGAAT TGGCTCAGTA TGGTCAGGTC TTTGTAACGA CAGATGATGG TTCTTATGGC 720

ATCAAGGGAA ATGTTTCCGT TGTTATCAAT CATTTAGACA GTCAGTTTGA TGCTGTTTAC 780

TCGTGTGCGG CTCCAGGAAT GATGAAGTAT ATCAATCAAA CCTTTGATGA TCACCCAAGA 840

GCCTATTTAT CTCTGGAATC TCGTATGGCT TGTGGGATGG GAGCTTGCTA TGCCTGTGTT 900

CTAAAAGTAC CAGAAAACGA GACGGTCAGC CAACGCGTCT GTGAAGATGG TCCTGTTTTC 960

CGCACAGGAA CAGTTGTATT ATAAGGAGAA AATTATCACT ACAAATCGAT TACAAGTTTC 1020

TCTACCTGGT TTGGATTTGA AAAATCCGAT TATTCCAGCA TCAGGCTGTT TTGGCTTTGG 1080

ACAAGAGTAT GCCAAGTACT ATGATTTAGA CCTTTTAGGT TCTATTATGA TCAAGGCGAC 1140

AACCCTTGAA CCACGTTTTG GGAATCCAAC TCCAAGAGTG GCAGAGACGC CTGCTGGTAT 1200

GCTCAATGCA ATTGGCTTGC AAAATCCTGG TTTAGAGGTT GTTTTGGCTG AAAAGCTACC 1260

TTGGCTGGAA AGAGAATATC CAAATCTTCC TATTATTCCC AATGTAGCTG GTTTTTCAAA 1320

ACAAGAGTAT GCAGCTGTTT CTCATGGGAT TTCCAAGCCA ACTAATGTAA AAGCTATCGA 1380

GCTCAATATT TCTTGTCCCA ATGTTGACCA CTGTAATCAT GGACTTTTGA TTGGTCAAGA 1440

TCCAGATTTG GCTTATGATG TGGTGAAAGC AGCTGTGGAA GCCTCAGAAG TGCCAGTTTA 1500

TGTCAAATTA ACCCCGAGTG TGACCGATAT CGTTACTGTC GCAAAAGCTG CACAAGATGC 1560

GGGAGCAAGT GGCTTGACCA TGATCAATAC TCTGCTTGGA ATGCGCTTTG ACCTCAAAAC 1620

TAGAAAACCA ATCTTGGCCA ATGGAACAGG TGGAATGTCT GGTCCAGCAG TCTTTCCAGT 1680

AGCCCTCAAA CTCATCCGCC AAGTTGCCCA AACAACAGAC CTGCCTATCA TTGGAATGGG 1740

AGGAGTGCAT TCGGCTGAAG CTGCCCTAGA AATGTATCTG GCTGGGGCAT C7GCTATCGG 1800

AGTTGGAACA GCTAACTTTA CCAATCCTTA TGCCTGCCCT GACATCATCG AAAATTTACC 1860

AAAAGTCATG GATAAATACG GTATTAGCAG TCTGGAAGAA CTCCCTCAGG AAGTAAAAGA 1920

CTCTCTGACC TAAACTGCAA TCAATCTGTT CTTGATTTTT TATTAGTTTG TAATATGAAT 1980

TTAGGAGAAT TTTGGTACAA TAAAATAAAT AAGAACAGAG GAAGAAGGTT AATGAAGAAA 2040

GTAACATTTA TTTTTTTAGC TCTCCTATTT TTCTTAGCTA GTCCAGAGGG TGCAATGGCT 2100

AGTGATGGTA CTTGGCAAGG AAAACAGTAT CTGAAAGAAG ATGGCAGTCA AGCAGCAAAT 2160

GAGTGGGTTT TTGATACTCA TTATCAATCT TGGTTCTATA TAAAAGCAGA TGCTAACTAT 2220

GCTGAAAATG AATGGCTAAA GCAAGGTGAC GACTATTTTT ACCTCAAATC TGGTGGCTAT 2280

ATGGCCAAAT CAGAATGGGT AGAAGACAAG GGAGCCTTTT ATTATCTTGA CCAAGATGGA 2340

AAGATGAAAA GAAATGCTTG GGTAGGAACT TCCTATGTTG CTGCAACAGG TGCCAAAGTA 2400

ATAGAAGACT GGGTCTATGA TTCTCAATAC GATGCTTCGT TTTATATCAA AGCAGATGGA 2460

CAGCACGCAG AGAAAGAATG GCTCCAAATT AAAGGGAAGG ACTATTATTT CAAATCCGGT 2520

GGTTATCTAC TGACAAGTCA GTGGATTAAT CAAGCTTATG TGAATGCTAG TGCTGCCAAA 2580

GTACAGCAAG GTTGGCTTTT TGACAAACAA TACCAATCTT GGTTTTACAT CAAAGAAAAT 2640

GGAAACTATG CTGATAAAGA ATGCATTTTC GAGAATGGTC ACTATTATTA TCTAAAATCC 2700

GGTGGyTACA TGCCAGCCAA TGAATGGATT TGGGATAAGG AATCTTGGTT TTATCTCAAA 2760

TyTGATGGGA AAATrGCTGA AAAAGAATGG GTCTACGATT CTCATAGTCA AGCTTGGTAC 2820 

TACTTCAAAT CCGGTGGTTA CATGACAGCC AATGAATGGA TTTGGGATAA GGAATCTTGG 2880 

TTTTACCTCA AATCTGATGG GAAAATAGCT GAAAAAGAAT GGGTCTACGA TTCTCATAGT 2940 

CAAGCTTGGT ACTACTTCAA ATCTGGTGGC TACATGGCGA AAAATGAGAC AGTAGATGGT 3000 

TATCAGCTTG GAAGCGATGG TAAATGGCTT GGAGGAAAAA CTACAAATGA AAATGCTGCT 3060 

TACTATCAAG TAGTGCCTGT TACAGCCAAT GTTTATGATT CAGATGGTGA AAAGCTTTCC 3120 

TATATATCGC AAGGTAGTGT CGTATCGCTA GATAAGGATA GAAAAAGTGA TGACAAGCGC 3180 

TTGGCTATTA CTATTTCTGG TTTGTCAGGC TATATGAAAA CAGAAGATTT ACAAGCGCTA 3240 

GATGCTAGTA AGGACTTTAT CCCTTATTAT GAGAGTGATG GCCACCGTTT TTATCACTAT 3300 

GTGGCTCAGA ATGCTAGTAT CCCAGTAGCT TCTCATCTTT CTGATATGGA AGTAGGCAAG 3360

AAATATTATT CGGCAGATGG CCTGCATTTT GATGGTTTTA AGCTTGAGAA TCCCTTCCTT 3420 

TTCAAAGATT TAACAGAGGC TACAAACTAC AGTGCTGAAG AATTGGATAA GGTATTTAGT 3480 

TTGCTAAACA TTAACAATAG CCTTTTGGAG AACAAGGGCG CTACTTTTAA GGAAGCCGAA 3540 

GAACATTACC ATATCAATGC TCTTTATCTC CTTGCCCATA GTGCCCTAGA AAGTAACTGG 3600

GGAAGAAGTA AAATTGCCAA AGATAAGAAT AATTTCTTTG GCATTACAGC CTATGATACG 3660 

ACCCCTTACC TTTCTGCTAA GACATTTGAT GATGTGGATA AGGGAATTTT AGGTGCAACC 3720 

AAGTGGATTA AGGAAAATTA TATCGATAGG GGAAGAACTT TCCTTGGAAA CAAGGCTTCT 3780 

GGTATGAATG TGGAATATGC TTCAGACCCT TATTGGGGCG AAAAAATTGC TACTGTCATC 3840

ATGAAAATCA ATGAGAAGCT AGGTGGCAAA GATTAGTACT ATAAGTGAAT ATGATTTGAG 3900 

TGAATAGTAA GTTAAAAATC CTGATTTCAA GTAAAATCAG GATTTTTTCA TGGATGCAAT 3960 

TTTTTTGGAG TCTGGTGTGA CGCGGAGGGT CTTTTGTCCT GTGTAAGTGA CAAAGCCGGG 4020 

TTTTCCACCA GTTGGTTTAT TGAGTTTTTT GACTTCAATC ATATCTACCT GCACCAGA1T 4080 

CGACACGCGC CCTTGAGAGA AGTAGGCAGC TAACTCTGCT GCGTCTGTCT TGACTCCATC 4140 

AGATGGGTCA AGATTTCCTG AGATGACAAC ATGGCTTCCA GGAATGTCCT TAGCATGGAA 4200 

CCAAAGTTCC TCCTTGCGGG CCATTTTAAA GGTCAATTCC TCATTTTGAA GATTGTTTCG 4260 

TCCGACATAG ATGATGGTTT TGCCATCGCT TGCTAGATAT TGTTCTAGTT TTTTGCGTTT 4320

CTGGATTTTC TCCCGTTGTC TTCTGCGGAT AAAACCTGTT TGAATCAATT CTTCACGGAT 4380

TTCAGCGATT TCTTCCAGTC CAGCTTGGTT GAGGACGGTT TCTACACTTT CCAGATAGAG 4440 

AATAGTGGCT TTGGTTTCTT CAATCAAATC AGTCAAGTAT TTGACAGCTT CTTTGAGTTT 4500

CTGATACCGT TTAAAATAGC GTTGGGCATT CTGGTTGGGA GTCAGAGCCT TATCAAGCGC 4560

AATCATGATA GGTTGGTTGG TATAGTAGTT GTCTAGGATA ACCTGGTCTT GGTCGTTAGG 4620 

CACTTGGTGG AGGAAGGTTG TCAGCAATTC TCCTTTTTGA CGAAATTCTT CAGCGTTGTC 4680

TGTCGCCAGT AACTCTTTTT CCTGTTTTTT GAGTTTGTGT CGGTTTTTCT GAAGTTCATT 4740 

TTCAACACGA CGAATCAGTT CACTGGCCTG CTGTTTGACG CGGTCGCGCT CAGCCTTATC 4800 

CTTATAGTAG GTGTCCAACA AATCAGAAAG ATTTGCAAAA GGCTCTCCCA CCTGATTTGC 4860 

AAAAGGAACT GGACTGAAGG AAGTCTCAGT CAAGCATGGC TTGGTTTCTT GATTGAAAAA 4920 

ATTTCGGAAA GCGGAAAGTT TTTCACTAAC CAGTATCCTT TCCAATTCAT TTGCCGTATC 4980 

GCGTCCCAGA CCTTGAAAGA GGCTTTGAAG ATTTTTTGCT GTTAGTTCTT GGGTTTGCAG 5040 

GATTTCAAAG AGCTTTTCAT CCTTGATAGT AAAAGGATTG AGAGATTTTG TACTTGGCGG 5100 

AGCGATATAG GTCGATCCTG GAAGTAAGGT GCGGTAGCTA TTTTGTGAAA AGCCGACGTG 5160 

TTTGATAACT TCGAGGATTT TATGACTGCT TTTATCGACC AGTAGAATAT TACTGTGTTT 5220

CCCCATAATT TCGATAATCA AGGTAGCCTG GATATGGTCT CCAATCTCGT TTTTATTGGA 5280 

AACTGTAATT TCCACAATAC GGTCATTTTC CACTTGCTCA ATCGACTCAA TCAGGGCCCC 5340 

CTGCAAATAC TTTCTCAAAA CCATGATAAA GGTAGAAGGT TGAGCTGGAT TTTCAAAAGT 5400 

CGTTTGGGTC AGCTGAATGC GTCCAAAAAC TGGATGCGCA GAAAGGAGCA GGCGATGGCT 5460 

TTGGCGATTG CTGCGGATTT GCAAGACCAA CTCTTGTTCA AAAGGCTGAT TGATTTTCTG 5520 

GATCCGACCA TTCACTAATT CGCTTCGCAA TTCCTCAACT ATGTGGTGTA AAAAAAATCC 5580 

GTCAAATGAC ATCGTTCTCT CCTTGTGATT GTATTCCATA GTATTATATC AAAAAGGTAG 5640 

AATAAAATCA TGGAAATCTG GTATAATAAA GCCAAGTAAA GAGAAACGAG AAGCACATGT 5700 

ATATTGAAAT GGTAGATGAA ACTGGTCAAG TTTCAAAAGA AATGTTGCAA CAAACCCAAG 5760 

AAATTTTGGA ATTTGCAGCC CAAAAATTAG GAAAAGAAGA CAAGGAGATG GCAGTCACTT 5820 

TTGTGACCAA TGAGCGTAGT CATGAACTTA ATCTGGAGTA CCGTAACACC GACCGTCCGA 5880 

CAGATGTCAT CAGCCTTGAG TATAAACCAG AATTGGAAAT TGCCTTTGAC GAAGAGGATT 5940

TGCTTGAAAA TTCAGAATTG GCAGAGATGA TGTCTGAGTT TGATGCCTAT ATTGGGGAAT 6000 

TGTTCATCTC TATCGATAAG GCTCATGAGC AGGCCGAAGA ATATGGTCAC AGCTTTGAGC 6060 

GTGAGATGGG CTTCTTGGCA GTACACGGCT TTTTACATAT TAACGGCTAT GATCACTACA 6120 

CTCCGGAAGA AGAAGCGGAG ATGTTCGGTT TACAAGAAGA AATTTTGACA GCCTATGGAC 6180 

TCACAAGACA ATAAACGAAA ATGGAAAAAT CGTGACTTGA TATCCAGTTT AGAATTTGCT 6240

TTGACAGGTA TTTTTACTGC TATCAAGGAA GAACGCAATA TGeSAA-^CA CGCAGTGACG 6300

GCTCTAGTGG TCATCCTTGC AGGTTTTGTT TTTCACGTGT CACGAATCGA ATGGCTCTTT 6360

CTCCTATTCA GTATTTTCTT GGTAGTAGCC TTTGAGATTA TCAACTCTGC TATTGAAAAT 6420

GTGGTGGATT TGGCCAGTCA CTATCACTTT TCCATGCTGG CTAAAAATGC CAAGGATATG 6480

GCGGCCGGCG CGGTATTACT GGTTTCTCTT TTCGCAGCCT TAACAGGCGC ATTGATTTTT 6540

CTCCCACGAA TCTGGGATTT ATTATTTTAA ACAGTAAGAG GAAATTATGA CTTTTAAATC 6600

AGGCTTTGTA GCCATTTTAG GACGTCCCAA TGTTGGGAAG TCAACCTTTT TAAATCACGT 6660

TATGGGGCAA AAGATTGCCA TCATGAGTGA CAAGGCGCAG ACAACGCGCA ATAAAATCAT 6720

GGGAATTTAC ACGACTGATA AGGAGCAAAT TGTCTTTATC GACACACCAG GGATTCACAA 6780

GCCTAAAACA GCTCTCGGAG ATTTCATGGT TGAGTCTGCC TACAGTACCC TTCGCGAAGT 6840

GGACACTGTT CTTTTCATGG TGCCTGCTGA TGAACCGCGT GGTAAGGGGG ACGATATGAT 6900

TATCGAGCGT CTCAAGGCTG CCAACGTTCC TCTGATTTTG GTCGTGAATA AAATCGATAA 6960

GGTCCATCCA GACCAGCTCT TGTCTCAGAT TGATGACTTC CGTAATCAAA TGGACTTTAA 7020

GGAAATTGTT CCAATCTCAG CCCTTCAGGG AAATAACGTG TCTCGTCTAG TGGATATTTT 7080

CAGTCAAAAT CTGGATGAAG GTTTCCAATA TTTCCCGTCT CATCAAATCA CAGACCATCC 7140

AGAACGTTTC TTGGTTTCAG AAATGGTTCG CGAGAAAGTC TTGCACCTAA CTCGTGAAGA 7200

GATTCCGCAT TCTGTAGCAG TAGTTGTTGA CTCTATGAAA CGAGACGAAG AGACAGACAA 7260

GGTTCACATC CGTGCAACCA TCATGGTCGA GCGCGATAGC CAAAAAGGGA TTATCATCGG 7320

TAAAGGTGGC GCTATGCTTA AGAAAATCGG TAGCATGGCC CGTCGTCATA TCGAACTCAT 7380

GCTAGGAGAC AAGGTCTTCC TAGAAACCTG GGTCAAGGTC AAGAAAAACT GGCGCGATAA 7440

AAAGCTAGAT TTGGCTGACT TTGGCTATAA TGAAAGAGAA TACTAAGTAG AGGTAGGCTC 7500

ATGCCTGCTT CTTGTTTTTA CAGAAGGAGG ACTTATGCCT GAATTACCTG AGGTTGAAAC 7560

CGTTTGTCGT GGCTTAGAAA AATTGATTAT AGGAAAGAAG ATTTCGAGTA TAGAAATTCG 7620

CTACCCCAAG ATGATTAAGA CGGATTTGGA AGAGTTTCAA AGGGAATTGC CTAGTCAGAT 7680

TATCGAGTCA ATGGGACGTC GTGGAAAATA TTTGCTTTTT TATCTGACAG ACAACCTCTT 7740

GATTTCCCAT TTGCGGATGG AGGGCAAGTA TTTTTACTAT CCAGACCAAG GACCTGAACG 7800

CAAGCATGCC CATGTTTTCT TTCATTTTGA AGATGGTGGC ACGCTTGTTT ATGAGGATGT 7860

TCGCAAGTTT GGAACCATGG AACTCTTGGT GCCTGACCTT TTAGACGTCT ACTTTATTTC 7920

TAAAAAATTA GGTCCTGAAC CAAGCGAACA AGACTTTGAT TTACAGGTCT TTCAATCTGC 7980

CCTTGCCAAG TCCAAAAAGC CTATCAAATC CCATCTCCTA GACCAGACCT TGGTAGCTGG 8040

ACTTGGCAAT ATCTATGTGG ATGAGGTTCT CTGGCGAGCT CAGGTTCATC CAGCTAGACC 8100

TTCCCAGACT TTGACAGCAG AAGAAGCGAC TGCCATTCAT GACCAGACCA TTGCTGTTTT 8160 

GGGCCAGGCT GTTGAAAAAG GTGGCTCCAC CATTCGGACT TATACCAATG CCTTTGGGGA 8220 

AGATGGAAGC ATGCAGGACT TTCATCAGGT CTATGATAAG ACTGGTCAAG AATGTGTACG 8280 

CTGTGGTACC ATCATTGAGA AAATTCAACT AGGCGGACGT GGAACCCACT TTTGTCCAAA 8340 

CTGTCAAAGG AGGGACTGAT GGGAAAAATC ATCGGAATCA CTGGGGGAAT TGCCTCTGGT 8400 

AAGTCAACTG TGACAAATTT TCTAAGACAG CAAGGCTTTC AAGTAGTGGA TGCCGACGCA 8460

GTCGTCCACC AACTACAGAA ACCTGGTGGT CGTCTGTTTG AGGCTCTAGT ACAGCACTTT 8520 

GGGCAAGAAA TCATTCTTGA AAACGGAGAA CTCAATCGCC CTCTCCTAGC TAGTCTCATC 8580 

TTTTCAAATC CTGATGAACG AGAATGGTCT AAGCAAATTC AAGGGGAGAT TATCCGTGAG 8640 

GAACTGGCTA CTTTGAGACA ACAGTTGGCT CAGACAGAAG AGATTTTCTT CATGGATATT 8700

CCCCTACTTT TTGAGCAGGA CTACAGCGAT TGGTTTGCTG AGACTTGGTT GGTCTATGTG 8760

GACCGAGATG CCCAAGTGGA ACGCTTAATG AAAAGGGACC AGTTGTCCAA AGATGAAGCT 8820 

GAGTCTCGTC TGGCAGCCCA GTGGCCTTTA GAAAAAAAGA AAGATTTGGC CAGCCAGGTT 8880 

CTTGATAATA ATGGCAATCA GAACCAGCTT CTTAATCAAG TGCATATCCT TCTTGAGGGA 8940 

GGTAGGCAAG ATGACAGAGA TTAACTGGAA GGATAATCTG CGCATTGCCT GGTTTGGTAA 9000 

TTTTCTGACA GGAGCCAGTA TTTCTTTGGT TGTACCTTTT ATGCCCATCT TCGTGGAAAA 9060 

TCTAGGTGTA GGGAGTCACC AAGTCGCTTT TTATGCAGGC TTAGCAATTT CTCTCTCTCC 9120 

TATTTCCGCG GCGCTCTTTT CTCCTATTTG GGGTATTCTT GCTGACAAAT ACGGCCGAAA 9180 

ACCCATGATG ATTCGGGCAG GTCTTGCTAT GACTATCACT ATGGGACGCT TGGCCTTTGT 9240 

CCCAAATATC TATTGGTTAA TCTTTCTTCG TTTACTAAAC GCTGTATTTC CAGGTTTTGT 9300 

TCCTAATCCA ACGGCACTGA TAGCCAGTCA GGTTCCAAAG GAGAAATCAG GCTCTGCCTT 9360 

AGGTACTTTG TCTACAGGCG TAGTTGCAGG TACTCTAACT GGTCCCTTTA TTGGTGGCTT 9420

TATCGCAGAA TTATTTGGCA TTCGTACAGT TTTCTTACTG GTTGGTAGTT TTCTATTTTT 9480

AGCTGCTATT TTGACTATTT GCTTTATCAA GGAAGATTTT CAACCAGTAG CCAAGGAAAA 9540 

GGCTATTCCA ACAAAGGAAT TATTTACCTC GGTTAAATAT CCCTATCTTT TGCTCAATCT 9600 

CTTTTTAACC AGTTTTGTCA TCCAATTTTC AGCTCAATCG ATTGGCCCTA TTTTGGCTCT 9660 

TTATGTACGC GACTTAGGGC AGACAGAGAA TCTTCTTTTT GTCTCTGGTT TCATTGTGTC 9720 

CAGTATGGGC TTTTCCAGCA TGATGAGTGC AGGAGTCATG GGCAAGCTAG GTGACAAGGT 9780

GGGCAATCAT CGTCTCTTGG TTGTCGCCCA GTTTTATTCA GTCATCATCT ATCTCCTCTG 9840

TGCCAATGCC TCTAGCCCCC TTCAACTAGG ACTCTATCGT TTCCTCTTTG GATTGGGAAC 9900 

CGGTGCCTTG ATTCCCGGGG TTAATGCCCT ACTCAGCAAA ATGACTCCCA AAGCCGGCAT 9960 

TTCGAGGGTC TTTGCCTTCA ATCAGGTATT CTTTTATCTG GGAGGTGTTG TTGGTCCCAT 10020 

GGCAGGTTCT GCAGTAGCAG GTCAATTTGG CTACCATGCT GTCTTTTATG CGACAAGCCT 10080 

TTGTGTTGCC TTTAGTTGTC TCTTTAACCT GATTCAATTT CGAACATTAT TAAAAGTAAA 10140 

GGAAATCTAG TGCGAGTAAA AATCAATCTC AAATCCTCCT CTTGTGGCAG TATCAATTAC 10200 

CTAACCAGTA AAAATTCAAA AACCCATCCA GACAgATTGA 10240 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13206 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:



CGCTTTATCG TGGACGTGGT CAAGCCGAGA ATTTCATCAA GGAGATGAAG GAGGGATTTT 60 

TTGGCGATAA AACGGATAGT TCAACCTTAA TCAAAAACGA AGTTCGTATG ATGATGAGCT 120 

GTATCGCCTA CAATCTCTAT CTTTTTCTCA AACATCTAGC TGGAGGTGAC TTCCAAACTT 180 

TAACAATCAA ACGCTTCCGC CATCTTTTTC TTCACGTGGT GGGAAAATGT GTTCGAACAG 240 

GACGCAAGCA GCTCCTCAAA TTGTCTAGTC TCTATGCCTA TTCCGAATTG TTTTCAGCAC 300

TTTATTCTAG GATTAGAAAA GTCAACCTGA ATCTTCCTCT TCCTTATGAA CCACCTAGAA 360 

GAAAAGCGTC GTTAATGATG CATTAAAGAA CAGTCGAGAT GAAAAAATCG TGTGACGCAC 420 

CAAGGGAGGA GTCTGCCCTT TTGAGGAAAT CTAGCGAGGA AAAACGATAC TGGAACAGCA 480 

GAAAGTAAAA CTGACCTCAT GAGGAGGAAG AAAGTGGCTC ATGAGGTCAG GGGTTTTG7A 540 

AGTTACATCT AGTTGAGAGA GGTATGAATG ATTTGGGATT AATCATTTCT TGTTTTAAAT 600 

CAGGAGAATA GTAACGATTT TTTCCTTTTT TGACGAACTC TATTCCGTAA CGATCAATCA 660 

ATTTAATCAT GTACCTAATA TTAGAATTGT TTATCCCAAA TTTATTTGAA AGCTTCTCTA 720 

AGCTATATCC TTGTTTTCTA AGTTCATAGA TCTGAACTTT ATCATCATAA GTTAGTTTCA 780 

TAATAAAAAC ACCCCAAAAG TTAGATTTTT TCTGTCTAAC TTTTGGGGGG CAGTTCATTC 840 

AACACCTGAT ACTATGCGTT TTTCTTATTT GAAATACTTT TTACTCAACC TCTTTATACT 900 

CAATGAAAAT CAAAGTGCAA ACTAGAAAGC TAGCCTCAGG CTGCTCAAAA CAGTGTTTTG 960

AGGTTGCAGA TGGAAGCTGA CGTGGTTTGA AGAGATTTTC GAAGAGTATT ACTTAATCTT 1020

CTTGATACTT TGACTAAGAA TAAATCCTAC AATCATCCCT ACCATATTTT GCATAAAATT 1080

CGGTAGAATT TCTGGGAGGG CTGCTGCCCA GCCATTCATC AAAGCAGAAC CCAAGGCGTA 1140

GCCTCCTACC ATGGCAATAG TTGCTAAAAT AAGGCCTAAC CACTGACTTT TTCCTTTAAA 1200

TCCTGCGAAA AATCCCTGCA AGCCATGGTT GACCAAGCTA AAGAACATCC ACTGAGGGTA 1260

GCCTGATAAG AGGTCAATCA AGAAACTTGC TAGTCCTCCG ACTACCGCTC CTTCACGACT 1320

ACCAAAGTAA AAGGCCGCAA AGAAGACACC AGCATCTAAA AGAGTTAGAA TTCCTGTAGG 1380

TGTTGGGATT TTTAAGAAAT AACCTAGAAC CACAGAAAGG GCCGTTAATA GGGATACAAG 1440

GGCGATTTTA GTTGTTTTTG TTTGCTTCAT ATTGTCTTAC TCCATACTGA TCTGCTTGTG 1500

CAATAGCACG ATAAACGAAA GCCTTAGAGC TTTCTACTGC TGGCAAAAGT TTATCACCTT 1560

TAACCAGGTG ACTGGCAATG CTAGAGsCAA AGGTACAACs TGCACCAGCA TTTTGGCCTT 1620

GGATAACTGG ATTTTCTAGG ATAGTAAAGG TCTGTCCATC ATAAAAGACA TCCACAGCCT 1680

TGTCCTGACT AAGACGATTG ccrcccrrGA TAATGACTCC GGCGCTCCTA AATCATGCAA 1740

TTTCTGCGCT GCAGTTTTCA TGTCTTCCAA GGTTTTAATT TCCTGACCGG ATAATAATTC 1800

TGCTTCTGGG AGATTAGGCG TAATCACACT GACATAAGGG AAAAAGCGAA TCAACTCTTG 1860

GCAGAGCTCA CTGACAGCTA CATCATGCGT TTCCTTGCAG ACCAAGACAG GATCCAACAC 1920

CACAGGTACT CCTGGGCGTT GTTTGATAAA GTCCAAGGCC TTCTCAGCCA CCCTGACAGT 1980

ACGGAGAAGA CCAATCTTAA TTCCCCCAAA TTCCACATCA CGCAAGCTAT CTAATTCATC 2040

TTGAAAAATG GTATCATCAG TTGGAAAGAC TTCAAATCCT TTTTCTGTCA ACGCTGTCAA 2100


ArX^GTCACT 


GCTACAAACC 


CATGCAAGCC 


GTTCAAGGT.\ 




AATCAGCTGA


2160 


CAGTCCXCCA CCACTAAAAA TATCATTTCC AGAAAGTGCT AAAATACGAT TATTCTTCAT 2220

AACGAATCTC CTTTAAATAC AAACCATTTG GTGCTGCAGT GGGACCTGCA AGTTGCCTGT 2280

CCTTCTTCTC CAAGATGAGA TCAATCTGCT CTACTGGCAT GCGGTTGTTA CCGATTTTGA 2340

GAAGAGTCCC CACCATATTG CGAATCTGTT TATACAAGAA ACCATTTCCT GAAAAGGTAA 2400

AGGTCAAAAA TTGTCCTGTC TCATCGACTA TTAAACTAGC TTCTGTGATG GTGCGAACCT 2460

TATCCTCTAC ACTAGTCCCA GAGGCTGTAA AACCGGTAAA ATCATGGGTT CCCTCTAGCT 2520

TTTTGATTGC AATCTGCATT CGTTCCACAT CGAGTGGGTA GGGAAAGTGG GTGGCATAGT 2580

GACGGCGCAT CGGATTTTTG GGACGTCCTC TATCCACAGT AAACTCATAG GTCTTGCTAT 2640

GCTTGGCATA ACGGCAATGA AAATCATCTG CCACAAGCTC AATCGAAATC ACATCAATAT 2700


CTTCAGGAGA


CTGGGTATCP: 


AAGGCAAAAC 


GGAGTTTCTC 


CTC ATC CATC 


TGATAAGGCA 


2760 


GGTCAAAATG 


A ATP A CCTGT 


CCCAGGGCAT 


GAACCCCACT 


ATCTGTCCTA 


CCAGCACCGT 


2820 


P.AACAGTAAT 


p^PTmcp pt 


TTATTTAATC 


TGGTCAAGGT 


TTTTTCAATT 


TCTTCCTGAA 


2880 


l_VJ\. Jl /YV_ ljV_ VJV- 


ATfiAn'P.PTGG 


CGCTGAAAGC 


CAGCAAAGGC 


ATAACCATCA 


TAGGAAATAG 


2940 


TTP.C TTT ATA 
& 1 vx* 111 i 


TCTCGTCATA 


GCPTCTATTT 


TATCAAGAAA 


TTAGTCTGTA 


AACAAGGACC 


3000 


TAAAACAAAT 


ATrrrrATGGG 


TATAAAAATC 


T C AT ACPCTT 


CGAAAATCTC 


TTCAAACCAC 


3060 


Vj l V_ rvo 1 1 > v^V, 


atpttipaap p 


TP AAC AC ACT 


ATTTTGAGCA 


ACCTGCGGCT 


AGCTTTCTAT 


3120 






GAACAACTCT 


ATTAGGAAAG 


TCAAATTAAT 


TTCTAGAAAT 


3180 


Al I I I AuLAVj 




CTATTCCAAA 


CTC A ATC AAC 


TATAGTTTGC 


TCTTTGATTT 

L AAA * AAA 


3240 


lA_Al HjAIjIA 




AAPTTAfifiAA 
AAL 1 1 flwMA 


TPAATfPTAA 


p_PT PT PTTPT 


GAAGTAGGTA 


3300 

•J W w V 


V_ A I bALAAftu 


ATAPAPATTA 


PAATPAAPfA 


ACCTCCTAAG 


ATACTAAAGA 


CCAACATCCC 


3360 


A top/" TV* A 




TTPP&PPTAP. 


&ACrtAATfKtfi 


GTPGTAAAGG 


PTPCG AAACT 


3420 


AC At>CCTAA T 


AC AVjCAAA 1 *j 


AALt I 1\>C I I v» 


ATTCACnACT 


TTAP.PTGGAA 


TTCCTTCAGA 


3480 


Af* A A rvpm/^ A 


A iP iPPPHW 

AAtiALCO lT-u 


TPHAPAPTAP 


APT AT AHCC A 


AATPPAGCCA 


GAACACTTCC 


3540 


roCTACTACt. 


ACCCACAAOi* 


ATPAAPAPAA 


(V^PAATPAPf; 


ATTTGCCCCA 


AGCCAAAGGT 


3600 


AATACCAOAC 


^ A/"* A /V Af^/"* A 


Ij 1 1 1 A_ 1 V_ ill 


AAAP:ATAfIAA 


ATPAAGAAAC 


AAAAAjPTCAC 


3660 

J V* w W 


l_l_t_ At»t_v_Al_A 


A1*PPPT A TV* A 


iPTYlPJTPAT 


APTAAGAACA 


AAACTAGATA 


ACTGGGCATC 


3720 




Li 1 lf.wM.WA 


TPAAAPTPrai 


A AT APGGATG 


GTAATAGCTG 


TATTGGTACA 


3780 


AAL 1 AC AAl. 1 


PPPf P TTPn A 
L>CIA?C 1 1 Lun 


T APPTA 


AAAAATCAAG 


p PTTT C ATTT 


CTCGAGTTAA 


3840 


ik pp APTTfipr 




iiHiHiwiif'ifMrv^ AC 

l 1 1 < L i * VlAV, 


TTCTT PTTT 


GATTTTCCAT 

^# AAA* V« ■ » w 


AAGGGACAAA 


39C0 




Arrj"prAPPA 


C C A AAAA ""CP 




CCTAGAAAGA 


TAGCTGTCCA 


3960 




AAPAAPTP.AP 


CCACCnCCAA 


CaTPAATGAGA 


GAAGCTCCAA 


CGACCTCTGC 


4020 


AGAAGCCCGT AGCCCTAACA TCTGAATTCG CCTTTTTCCT TGGTAGCGTT CACTGATAAT 4080

AGAAATGGCC TTGGCATTGA TCATCCCAAG ACCCAAACCA AAGAGAAGCC CTGTTCCAAA 4140

GACAAAGGGA TAGGCTTGGT ACCAGAAGGG AGCTGTACCG CTCAATCATA AAATCAGCAA 4200

GCCCAAACTA ATCTGTAAGC GCTCAGGAAA TATTTTTTCT AAGAAACCAT TTAGCAGTAA 4260

CATCATCATG ATTCCAAACG AAGGCAAGCT CACCAAGAGC TCAATTTGTT CCTTAGAATA 4320

ACCCTGATAA TAGTCAAACA TGGCTGGTAG GGCACTCGAA ATGGAAAAGG AGGTAATCAA 4380

AACGAGGGAG AGAGCCAAAA TGCTGGCCCG TTCTAaAaAT TGTTTCATGA AATCTCTTTC 4440

TATATTTCTC TTAATCTTCT ACrrrrrTGA TAGTTATCAA ATAAGCAAGA AAAGAAGAAG 4500

CCTCATTGGT TTGTAGACTC CTTCTTAAAT TCGAAAATGA ATCCCTTGTA TCTTATACTC 4560

AATGAAAATC AAAGAGCAAA CTAGGAAGCT AGCCGCAGGT TGTTCAAAAC AGTGTTTTGA 4620

GGTTGCACAT GGAAACTGAC GTGGTTTGAA GAGATTTTCG AAGAGTATTA GGATGACTTT 4680

CTCTTGATTT GCTTGATAAA GTAGAAAATA AATCCTGCTA CCATATAGGC AACAAAGATA 4740

ATCAGACACC ACTTAAACAC AACATTCCAA CCCTTGTTCA CATTCAAAAA GAAGTAAGGG 4800

AAAGGATTAT CCTTGGCATT TGGAATATTG AGTTTTAGAA CCAAGCCATT AAAAAGAGCA 4860

AACATCATAT ACAGAAAGGG TAAAATGGTC CACACTGCTG GATCCCAAAT CTTGTATTGA 4920

CCCTGTTTGT CAAAAAAGAG GGTATCCGCT AAAAACCAGA TGGGAACGAT ATAGTGGCAA 4980

AGGAAATTTT CTAGGGTATA GAAATTAGTC GCAATGGGCG CCAAGAGGAA ATGGTAAATC 5040

ACACAGGTAA TCATGATACT CATGGTGACC CCACCTTTTA AGCGCAAGAG ACTTGGCCTT 5100

TGCCAATTTT CACCTACACG GCTCATAACC TTTAGAAGAT AAAGGCTAAA AATAGTTACC 5160

AAGAGGTTGG ACAGAACCGT GTAATAGAGA ACCATCCCAA AACCACCATG CTTAGTAATT 5220

TCAAGATAAA CTCCCGTAAA AGCCGCTAGA AACAAGAACA TACGGCTATA AAATACAAGT 5280

TTATAGTGTT TTGACATGCT TAAATCTTCC TCACAAACTC TGATTTAACT TTCATGGCAC 5340

CAAAACCATC AATCTTACAG TCGATATTGT GGTCGCCTTC TACGATGCGG ATATTTTTCA 5400

CGCGCCTCCC TTGTTTCAAA TCTTTTGGCG CACCTTTTAC TTTCAAGTCC TTGATGAGAG 5460

TTACTGTATC ACCATCAGCC AATTTATTTC CGTTGGCATC GATAGCCACA AGACCTTCTT 5520

CTACTTCTGC AACTTCAGCA GGATTCCACT CATGAGCACA CTCTGGGCAA ACCAGTAGGG 5580

CACCGTCTTC GTAGACATAC TCTGAGTTAC ATTTTGGACA ATTTGGTAAA TTCTTCATGG 5640


TTTCTCCTTA 


TCATCATTCA 


CTVTTCTTTG 


AAAATCAAAA 




AGCAACTATT 


5700 


ATACCCTAAA ATCAGCATTT TGACAAATT7 AGAAAAAAAC CGATATCAAT CTATCGCCTT 5760

TTCTACATTT ACATTCTTTT TTCAGCTTCT GCTTTGATTT TTTCAACTAC TTCTTGAATG 5820

TTCAAACCAG TTGTATCAAG GTAGACAGCA TCCTCTGCTT GTTTGAGAGG AGAAGTCTCA 5880

CGATGACTAT CCTTGTAGTC ACGCGCAGCA ATTTCCTTTT TTAGGGTTTC AAGGTCTGTT 5940

TCAATTCCCT TGGCAATATT TTCCTTGTAA CGACGCTCTG CTCTCTCATC AACAGAACCT 6000

ACTAGGAAAA TTTTCAATTC TGCTTGTGGC AATACAACAG TTCCAATATC GCGACCATCC 6060

ATGACAATCC CGCCTTGCTG GGCAATTTCT TGTTGGAGAG AAACCAGTTT CTCACGCACT 6120

TGAGGAATTG CTGCAATAGC AGAAACATGA TTGGTCACTT CATTTTCACG GATAGGATGG 6180

GTAATATCCA CATCTCCTAC AAAAACAAGC TGGTCTCCAG TTTCTGAACG TCCAAAGCTG 6240

ATTGGATGCT GGTCCAACAA GGCTAGAAGG GCTTCGACTT CTTCAACTCC TAATTGGTTC 6300

TTAAGAGCCA TATAGGTCGC TGCACGATAC ATAGCTCCTC TATCAAGGTA GGTGAATCCA 6360

AAATCCTTAG CAATAATCTT TGCGACCGTA CTCTTACCGC TGGAAGCAGG ACCATCAATA 6420

GCAATTTGAA TTGTTTTCAT ATCGGCTCCT ATTTTATTTT TATAACATCA CCTGGATTAG 6480

CAAACCAAGA TCCTGTAGCC ATGTGCCCAG GATTCAAGGC CTCTAACTGA GCAATGGAGA 6540 

TTCCTGCACG AGCGGCAATA GCTGCTTCCC CTTCTCCTGC GAGAACTTTA ATCGTTCCTT 6600

CAGGATTAGC AGCTTCTTCT GAACTACTAG AAGTAGATTC TGGCTCTGAA CTCTGCTCAG 6660 

GCTGAGAACT ACTTGAAGAT GAGATTTGTA CTACACTGGC ATCAGAATCA TGAAAGCCTT 6720 

TTAAGGCTGC TGTGCGATTA CTCCCCCCCG ATGATAGATA GATGAGAACG ATGACCATCA 6780

CCACCACAAT TACAAAGAAA ATACTAGCTA GGATCGTCAA AATACGATTA GCCATCCTAT 6840 

CAGCCCCTCC GTGGTTTCGA TGCCGACGCT CTGCTCTTGA TTCTTCTTGA TCATAGATAT 6900 

CTTCTTGCCA CGGTTCTTTT GCCATACCTT ACTCCTTGTT TTTTTTTACT TTTCTTATTA 6960

CAATATAAAT ATGAACATGA AAATCACACT TATACCTGAA CGATGTATCG CCTCTGGCCT 7020 

TTGCCAAACT TATTCTGATT TATTTGATTA CCACGATAAT GGAATCGTGC GTTTTTACGA 7080 

TGACCCTGAC CAACTGGAAA AAGAAATTTC TCCTAGTCAG GATATCTTAG AGGCTGTTAA 7140 

AAATTGCCCA ACTCGCGCCC TGATTGGAAA CCAGGAAGCC TAAATCAATG GCGATAATCC 7200 

ACTCCCTCTA GTTTAGCACA TTTCCATGTA AAATTATAGT CTTTTCACTT TATTTTTTTC 7260

TGTAAAATCA GGAAGGTCAC TTTTTTCTTT GATAAGATAA AGTGGTCTTT TTTTAGTCTC 7320 

TAAATAAATC TTACTGATAT ACTTGCCGAG AATCCCAATG GTCAAGAGTT GAATGCCTCC 7380

AAGAAAGAGA ATAACAGCCA TCAGAGAGGT CCAACCAGAT GTCGGATTGC CCAAAATGAG 7440

GGTCCGAACC ACAACAAAAA AGGTCATCAG CAGAGAAAGA AAACAAGATA GGAGACCAGC 7500

TACAAAGGCT ATAATCAAGG GAAAATCTGA AAAATTAATA ATCCCTTCAA TGGAGTAGAA 7560

AAAGAGTTGC CTAAAACTCC AACTTGTCTT GCCAGCCTGC CTTTCGACAT TTCGATAGTC 7620

CAAATAGTAG GTTTTGAAAC CCACCCAGGC GAAGAGCCCC TTTGAAAAAC GATTGCACTC 7680

GGTCAAGCTT AAAATGCCAT CGACTACAGA CCTTCTCATC ATACGAAAAT CACGGACACC 7740 

CGACGGCAGA GCTACTGGGC TGATTTTTTG CATGAGGCGA TAAAAGAGAA CAGCACAGAA 7800 

ACTGCGAAAG AAGGGTTCTC CCTCCCGACT AGTTCTCCGT GTCCCAACGC AGTCCAAGTC 7860

TACATTTTTG TCTAATACAT TTTTCATCTC AAACAAGATA CTAGGAGGAT CTTGGAGGTC 7920

TGCATCCATC ACCACCACCA AATCTCCTGT CGCATATTGC AAGCCTGCAT AAAGGGCTGC 7980 

TTCTTTGCCA AAATTTCGAG AGAAAGAAAT ATAATGGACT GCCGGATTTT GCTCCCGATA 8040

GGCCTTTAAG AGTTCCAAGG TCCCATCACT TGATCCATCA TCGACAAAGA CATACTCGAT 8100

TTCTGTTTCC AAATCTGGAA GTAAAGCTTC CAGAGCCTGA TAAAAAAGAG GAAGTACTTC 8160 

CTCTTCGTTT AAACAAGGGA CGATGATTGA AATCATCATC TTAGTCTTCA AATCCATTTG 8220 

GATGCTTGCT TTGCCAACGC CATGCGTCTT CACACATTTG GGTGATGTCG AGTTCTGCTT 8280 

CCCAACCGAG TTCTGCTTTA GCTTTTGCCG GGTCTGAGTA GCAGGCAGCG ATATCACCTG 8340 

GGCGACGTTC TACGATGCGG TAAGGAATAG GACGGCCCAC CGCTTTTTCC ATCTTTTGGA 8400 

TAATTTCAAG AACTGAGTAA CCTTTACCAG TTCCAAGGTT ATAAACGTTT AGTCCTGAAC 8460 

CTTTTTGGAT TTTTTTCAAA GCTGCAACGT GACCCTTAGC CAAATCGACA ACGTGGATAT 8520

AGTCACGAAC ACCTGTTCCA TCTTCCGTAT CGTAATCGTC TCCAAACACT TGCACTTGCT 8580 

CTAATTTTCC AACGGCTACT TGAGTCACAT ATGGCAAGAG ATTGTTTGGA ATACCGTTTG 8640 

GATTTTCTCC CAAATCACCA CTCTCATGGG CTCCGATTGG GTTAAAGTAA CGAAGCAAGA 8700 

CAACATTCCA TTCTGAGTCT GCTTTGTAAA TATCAGTCAA AATTTCCTCT AGCATCAGCT 8760 

TAGTACGACC GTATGGGTTG GTCACTGAAA GTGGGAAATC TTCCAAGATG GGCACTGTGT 8820 

GCGGATCCCC GTAAACTGTC GCAGAAGAAC TGAAGATGAT GTTTTTACAG TTGTTTTCTT 8880 

CCATGGCTTT CAAAAGGCTG ACAGTTCCAG CGATATTGTT GTCATAGTAG GCAAGAGGGA 8940 

TACGTGTTGA TTCGCCAACA GCCTTCAAAC CAGCAAAGTG AATGACACCA GTCGGTTCTT 9000 

CCTGCTTGAA AATATCTCTG AGGGTATCTG TGTCACGAAT ATCTGCCTCA TAGAAAGGAA 9060 

TCTCAACTCC TGTGATTCCT TCAACAACTT CTAAACTCTT ACGATTGCTA TTGACAAGAT 9120 

TATCCACCAC AACAACTTGA TGACCTGCTT GGATCAATTC AATAACAGTG TGGGTTCCAA 9180 

TAAAACCGGC ACCACCACTT ACCAAAATCT TTTCTTGCAT CTTTTTTCCT CCaTTCTCAG 9240

ATTATTTTTT CTTATTTTAC CATTTTTGAC AGGGAATGTC ATTTGCCATC CTAAACTACC 9300 

TGATAAAATT TCAGTAAAAT GCTTATACTC TTCGAAAATC CAATTCAAAC TACGTCAACG 9360 

TCGCCTTGCC ATGGGTATGG TTACTGACTT CGTCAGTTCT ATCCACAACC TCAAAACAGT 9420 

GTTTTGAGCT GACTTCGTCA GTTCTATCCA CAACCTCAAA GCAGTGCTTT GAGTAACCCG 9480 

CGGCTAGTTT CCTAGTTTGT TCTTTGATTT TTATTGAGTA TTATTCGCTT TTTACTCGTT 9540 

TGACATAGTT TTCAATTGGG TAATTTAGAG GGTCCAAGGT CAACTCCTTG TCTTGGATCA 9600 

GTTGGGCTAG ATGGTAACCA ATGATAGGAC CAGTTGTGAG GCCTGATGAA CCTAGTCCAC 9660 

TGGCTGCATA GACACCAGTT AAGTCAGGCA CCTGCCCAAA GAAAGGAGAG AAATCACTGG 9720 

TGTAGGCACG GATTCCAACA CGCTCAGATT TTGAAGTAGC TTCAGCCAAA ATCAGATAGT 9780

AACCAGTCAT TGAGCCCATT TTCAACGATA TCAATATTGC TATGGCTGAC ATAAACTGCG 11160

ATATCATCCT TAATCAGGTC GATGTAAATC TGATTTTGCG GACGGCTGGC AAGCAAGTCC 11220

TTGATAGGAC GAAAGATAGG CGCGTGCTTG ACGATAATCA AGTCCACACC CTTTTCAATG 11280

GCCTCTGCCA CTGTCTCTTC ACGAATATCG AGGGCAACCA TGACCCTTTG GATACCCTTG 11340

TCTAAAGTGC CAATTTGCAG ACCACGGCTG TCTCCCTCCA TAGAAAATTC CTGAGGGCAA 11400

AAGGCTTCAT AAGCTTGGAT CACTTCACTT GCTAACATGG AGCACCTCCT TGATAGCTTG 11460

AATCTTATCT ACTAGAACTT GACGTTCTTC CAGATTTTTT TCTGGGATTT GTCCGAGGGC 11520

GAACTCTAGC TTCTCAGCTT CTTTTTGCCA TTTTTGGACA AATACTGGAC TGACTTCTTT 11580

GGACAAGAAG GGACCAAAGC GAACATCACT GGCTGATAGC TTCATTTGTC CTGCTTCCAC 11640

CACCAAAATC TCATAAAACT TTCCAGCTTC TTCTAAGATG CTTTCTGCTA CAATCTGGAA 11700 

TCCATGATCC TGTAGCCAGA TACGCAAGTC GTCTTCACGA TTATTGGGCT GGAGGATCAA 11760 

ACGCTCTACA TTAGCTAACT TCCCCAAACC TTCTTCTAAA ATCCTAGCAA TCAAACGACC 11820 

ACCCATGCCA GCAATGGTAA TGACAGACAC TTGGTCAGTC TCTTCAAAAG CTGCCAAGCC 11880 

ATTGGCTAAA CGGACTTGGA TTTTCTCCTT TAGGCCGTGA GCCTCAACAT TTTTAACCGC 11940 

AGACTGATAG GGACCTTCCA CCACCTCACC TGCAATAGCG CTTTTGATTT GGCCTCTCTC 12000 

AACCAACTCG ATAGGCAGAT AAGCATGGTC ACTTCCCACA TCTAGTAAAA TAGCCCCCTG 12060 

TGACACAAAG GAAGCTACCA ATTCTAATCT CTTTGAAATC ATCTTCTCTC ACTTTCCAAA 12120 

ACTCTATTAC CTCTTATTAT ACCACATTTC AATCTTCAAC TTCCCAGTAA TATAAGCACC 12180 

TCTGGCGAAA GAAGTTTCAA TGTCCTAAAG TAATAAGTGA ATCCAATTGA AAGATTTTAA 12240 

ACAATTTGCA AAAATGTCAA AAAATAAAAA ATAAACAGTT TATTCAGAAA ATTCTTGACA 12300 

TATAAAAACA CATGGTAGAA TATAATTAGA AAGTTAGAAA AAATAAAAGT TTCACTAAAA 12360

TTTGTATTTG AAGGTGGTGT TCAGATAAGA AATTTAGTCA GACGAACCAC GAATTTGCTC 12420

TATGCTTTCT GGAATTTATC ATAACAGGAG GATACAGTCA TGGAACAAAC ATTGTTTGAA 12480

TTAGAACTAC TTCCAGACGA AGATATCATT GTCACAGGTC TCCCTAAGTA TTGTTCTTTT 12540 

ACTTGTTTAA TTACAGGTCG CTAGTTATAT TTTATATAAA ATAAGTAGCT TTACTTACGG 12600 

AATAGGCTAG TCCTCTGTCT CTAGCCTATT TTAATAATTA GGAGTTTGTT ATGGATTTAT 12660 

TAGAGAAAGA ATGTTTAAAA TGTGATAAAA ATTTCCAACA GGGTGATATT TGCAATTACT 12720 

ATTATTTATC AGATAAGATG CCTGCACAAG GGTGGAAAAT ACACATAAGC TCCCAAATAA 12780 

AAGACGCTGT AAATATTTTT AAGATTGTGT ATAAACTATC CCAACTAAAT AATTGTAGCT 12840

TTAAAGTTGT TAAAAATTTA GAGGAATTAA AAAAAATTAA TTCCCCTACG GAAATGAGCC 12900

CTACTCCTAA CAAATTTATA ACTCTATATC CTAAGTCAGA ATCTGAAGCT AAGAGTATGA 12960 

TTTGTAATCT TACGAATAGA CTGTCAGAAT TTAAGGCTCC AAAAATACTA TCTGACTATC 13020 

AATGTGGAAT GCATTCTCCA GTTCATTATA GATATGCGCC TTTTTTAAAA AAACAAGCTT 13080 

ATGATGAAAA AAATAAAAAA GTCATCTATT TATTGCTAGA TGAAAAAAGG AAGAACTATG 13140 

TAGAAGATAA GAGACAAAAT TTCCCTAGTC TTCCTAGCTG CAAAATGGAT TTATTTTCAG 13200 

AAGAAG 13206

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13104 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CCGGATCCAG CGAAAAATAT GCTCTTTGAT GCTGTAAGTG GTCAAAAAGA TGCTAAAACA 60 

GCTGCTAACG ATGCTGTAAC ATTGATCAAA GAAACAATCA AACAAAAATT TGGTGAATAA 120 

AAAATTTGTT CAAGGGGGGT GGAAATCAAA TCCCCCTTTG AATTTATCAA TAGAGACACA 180 

AATAATTTAG CTTTCTTATA AAAAAGTAGT ATCCTATGAA AGGAGTTAAT ATGGAAAAGC 240 

AACAACCTAG TAAAGCAGCC CTGCTGTCTA TCATTCCTGG GTTAGGACAG ATTTACAATA 300 

AACAAAAAGC CAAAGGTTTT ATCTTCCTTG GTGTAACCAT CGTATTTGTC CTTTACTTCC 360 

TAGCACTTGC AACCCCTGAA TTGAGCAACC TCATCACTCT TGGTGACAAA CCAGGTCCTC 420 

ATAATTCCCT CTTTATGCTG ATTCGTGGTG CCTTCCATCT AATCTTTGTA ATCGTTTATG 480 

TACTCTTTTA TTTCTCAAAT ATCAAAGATG CACATACGAT TGCAAAACGC ATTAACAATC 540 

GAATTCCAGT TCCACGCACA CTCAAAGACA TGATCAAAGC GATTTATGAA AATGGCTTCC 600 

CTTACCTCTT GATCATTCCA TCTTATGTTG CCATGACCTT CGCGATTATC TTCCCAGTTA 660 

TCGTAACCTT GATGATCGCC TTTACCAACT ACGACTTCCA ACACTTGCCA CCAAACAAGT 720 

TGTTGGACTG GGTTGGTTTG ACCAACTTTA CAAACATTTG GAGCTTGAGT ACCTTCCGTT 780 

CTGCCTTTGG TTCTGTTCTT TCTTGGACTA TCATTTGGCC TTTGGCAGCT TCTACTTTAC 840 

AAATCGTAAT TGGTATCTTC ACAGCTATCA TTGCCAACCA ACCATTTATC AAAGGAAAAC 900 

GTATCTTTGC TGTTATTTTC CTTCTTCCTT GGGCTGTCCC AGCCTTCATC ACTATCTTGA 960 

CATTCTCAAA CATGTTTAAC GATAGTGTCG GTGCTATCAA CACTCAAGTA TTGCCAATCT 1020 

TGGCTAAATT CCTTCCTTTC CTTGATGGAG CTCTTATTCC TTGGAAAACA GACCCAACTT 1080 

GGACTAAGAT TGCCTTGATT ATGATGCAAG GTTGGCTCGG ATTCCCATAC ATCTACGTTC 1140 

TGACCTTGGG TATCTTGCAA TCTATTCCTA ACGACCTTTA CGAAGCACCT TATATTGACG 1200 

GTGCCAACGC TTGGCAAAAA TTCCGCAACA TCACTTTCCC AATGATTTTG GCTGTTGCGG 1260 

CACCTACTTT GATTAGCCAA TACACCTTCA ACTTTAACAA CTTCTCTATC ATGTACCTCT 1320 

TCAATGGTGG AGGACCTGGT AGTGTCGGAG GTGGAGCTGG TTCAACCGAT ATCTTGATCT 1380 

CATGGATCTA CCGTTTGACA ACAGGTACAT CTCCTCAATA CTCAATGGCG GCAGCTGTTA 1440

CCTTGATTAT CTCTATCATT GTCATCTCAA TCTCTATGAT CGCATTCAAG AAACTACACG 1500

CATTTGATAT GGAGGACGTC TAAGATGAAT AACTCAATTA AACTCAAACG TAGACTGACT 1560

CAAAGCCTTA CTTACCTTTA CCTGATTGGT CTATCAATTG TAATTATCTA TCCACTGTTG 1620 

ATTACCATTA TGTCAGCCTT TAAAGCAGGT AACGTCTCAG CCTTTAAACT AGATACTAAT 1680 

ATCGACCTCA ATTTTGATAA CTTTAAAGGC CTCTTCACTG AAACCTTGTA CGGTACTTGG 1740 

TACCTCAACA CTTTGATTAT CGCCTTAATT ACCATGGCTG TTCAAACAAG TATCATCGTA 1800 

CTTGCTGGTT ATGCTTACAG CCGTTACAAC TTCTTGGCTC GTAAACAAAG TTTGGTCTTC 1860 

TTCTTGATCA TCCAAATGGT GCCAACTATG GCCGCTTTGA CAGCCTTCTT CGTTATGGCG 1920 

CTTATGTTGA ACGCCCTTAA CCACAACTGG TTCCTCATCT TCCTCTACGT TCGTGGTGGT 1980 

ATCCCGATGA ATGCTTGGCT CATGAAAGGC TACTTCGATA CAGTGCCAAT GTCTTTAGAC 2040 

GAATCTGCAA AACTAGACGG TGCAGGACAC TTCCGCCGCT TCTGGCAAAT TGTTCTACCA 2100 

CTTGTTCGCC CAATGGTTGC CGTACAAGCT CTCTGGGCCT TCATGGGACC TTTCGGGGAC 2160 

TACATCCTCT CTAGTTTCTT GCTTCGTGAG AAAGAATACT TTACTGTTGC CCTACGTCTC 2220

CAAACCTTCG TTAACAATGC GAAAAACTTG AAGATTGCCT ACTTCTCAGC AGGTGCTATC 2280

CTCATCGCCC TTCCAATCTG TATTCTCTTC TTCTTCCTAC AAAAGAACTT TGTTTCAGGA 2340

CTTACAAGTG GTGGCGACAA GGGATAATTT ATCCCCGCCA CCCTTTTTCA TTTTATACTC 2400 

TTCGAAAATC TCTTCAAACC ACGTCAGCTT TATCTCCAAC CTCAAAGTTG TGCTTTGAGC 2460 

AACCTGTGGC TAGTTTGCAC TTTGATTTTC ATTGATTATT AGCAATTGTC ACTGTAAATA 2520

ATATCCTTGT AGCAAGCAAT TTTTCTCCTA GACTTGAAAT AAAGCGCATT TCTCTATATA 2580 

ATAATACTCA TATAGAAAAC ACCTTTTAGA AAGATACCTA TGCTTCCATA TCCATTTTCC 2640

TA7TTTTCAA GTATT7CGGC GGTTCGTAAG CCCCTGTCCA AACGTTTCGA GCTCAACTGG 2700

TTTCAACTTC TCTTTACCAG TATCTTCCTT ATCAGCTTGT CTATGCTACC CATTGCTATC 2760 

CAAAACAGCT CCCAGGAGAC CTATCCGCTA GAAACTTTTA TCGATAATGT CTATGAACCT 2820 

CTGACAGATA AGGTTGTCCA GGATCTCTCT GAACATGCTA CAATTGTCGA TGGCACATTA 2880 

ACTTATACTG GAACAGCTAG TCAAGCCCCT TCTGTTGTGA TTGGTCCAAG TCAAATCAAG 2940

GAATTACCTA AGGACTTGCA ACTGCATTTC GATACAAATG AGCTAGTCAT CAGCAAGGAA 3000

AGCAAGGAAC TGACCCGCAT CTCTTACCGA GCCATTCAGA CTGAGAGTTT CAAAAGCAAA 3060 

GACAGCTTGA CCCAAGCAAT TTCTAAACAC TGGTACCAAC AAAATCGTGT CTATATCAGC 3120 

CTCTTCCTAG TTCTCGGTGC GAGCTTCCTC TTTGGTTTGA ATTTCTTTAT CGTCTCTCTT 3180 

GGAGCTAGCT TTCTCCTTTA TATCACCAAA AGATCACGCC TCTTTTCATT TAATACCTTT 3240

TTTCCAACCA AAGGTCAAAC CTGCTATCAG CATGATAGTT CCATTTACCA AGAAAGAAAT 5100

ACTACCGACA TTCTTACCCG TTTTCTTACG AATAGTCAGG CTGACGATAT CCGTCCCACC 5160 

ACTGGACATA TTGTTTCGAA GAGCAAAACC AATCCCCAAA CCCATAACAA CACCCCCAAA 5220 

AAGGGAATTG ATAATCGGAT CCTCTGTCAA GGTTGCCACA GGGACAAACT GGATAAAGAA 5280

GGAACTCATA GATACCGTGA TAAAGGTAAA GACGGTGAAC TTATGGCCAA TCTGATACCA 5340 

AGCTAAGACC ATCAAAGGGA AGTTAATGGC GTAGAAGCTT AGCGAAATCG GAATATGAAA 5400 

ACCAAACCAG TGATTACTCA AGGCAGAGAT AATCTGTGCC AGACCTGTTG CACCACTCGA 5460 

ATACACATGC CCTGGTTGGA AAAAGAAATT AACTGCTACT GCTGATAAAA AACCATAGAC 5520 

CAGAGAGGCC GAAATCTTCT CATCATACTT TTCTCGAGAG ATACTTTGTA AGACACGTAA 5580 

AATTTTTATC TGATAAGCAA AGCGGCGCAG ATAATAGCGC CACCGCTTAA TTCGTTTTGT 5640 

TTGTTTCATC TTCTTCTACT TGTAAGCTGA GTTCCTCTAG TTGTTTGAGA GCGACTGTTG 5700 

ATGGAGCTTG TGTCATTGGG TCAGTTGCCT TGTTGTTCTT AGGAAAGGCA ATGACTTCAC 5760 

GGATATTTTC TTCTCCAGCA AGCAACATGA CAAAACGGTC AAGCCCGATA GCCAAACCAC 5820 

CGTGTGGTGG GAAACCATAG TCCATGGCTT CAAGAAGGAA ACCAAACTGG TCATTGCCTT 5880 

CTTCAGTTGA GAAACCAAGA GCCTTGAACA TGCGTTCTTG AAGGTCTTTT TGGTTGATAC 5940 

GAAGGCTACC ACCACCAAGC TCATAACCGT TCAAGACGAT ATCGTAAGCA ATGGCACGAA 6000 

CCTTAGCCAA ATCACCTTCT AATTCATGAG CAGTCTCTTC CTGTGGAAGT GTGAAAGGAT 6060 

GGTGGGCGCT CATGTAGCGG CCTTCTTCTT CAGACCATTC AAACATCCGC CAGTCAACCA 6120 

CCCAAAGGAA GTTGAACTTA TCATTATCAA TCAAGCCAAG CTCTTTAGCA ATACGTCCAC 6180 

GAAGGGCACC CAGTGTTGCA TTAGCCACTT CAAGCGTATC CGCCACAAAG AGAACCAAGT 6240 

CCTTATCTTC AAGAACAAGC GCTGTTGTCA ATTCTTCTTG GATACCAGTC AAGAACTTGG 6300

CAACTGGTCC GTTTAATTCT CCATCAACCA CCTTGACCCA AGCAAGACCT TTGGCACCAT 6360 

ACTGTTTGGC TACTTCCGTC ATCTTGTCGA TGTCTTTACG TGAATAGTTG TCCGCAGCTC 6420 

CTGTGACCAC AATCGCTTTT ACAGCAGGTG CTTCTGAAAA GACTTTAAAG TCTACACCTC 6480

GGACCACTTC TGTCAAGTCC TGAAGCAACA TGTCAAAACG AGTATCTGGC TTGTCAGAAC 6540 

CGTAAAGAGC CATAGCATCA TCGTATTTCA TACGAGGGAA TGGTAGCGTT ACTTCGATGC 6600 

CTTTTGTTTC CTTCATCACG CGCGCGATCA AGCTTTCTGT AATATCTTGG ATTTCTTGCT 6660 

CAGTAAGGAA GGACGTTTCC AAGTCGACCT GAGTAAATTC AGCCTGGCGG TCTCCACGCA 6720

AGTCCTCCTC ACGGAAACAT TTAACGATTT GGTAGTAACG GTCAAAACCA GCATTCATCA 6780

AGAGCTGTTT CGTGATTTGT GGACTTTGAG GAAGAGCGTA AAAATGCCCC TTATTAACAC 6840

GAGACGGCAC TAAATAATCA CGCGCCCCTT CAGGCGTTGA CTTAGAAAGG AATGGTGTCT 6900 

CCACGTCGAT AAACTCCAAC TCATCCAAGT AGTTGCGGAT AGAGTGGGTC ACCTTGGCAC 6960 

GAAGTTTAAG ATTTTCCAAC ATTTCTGGAC GACGAAGGTC AAGGTAACGG TAACGCAAAC 7020 

GTCTATCGTC ATTTGCCTCA ATGCCATCCT TAATCTCAAA TGGTGTTGTC TTAGCTGTGT 7080 

TAAGCACAAT AAGAGCTGTC ACGTTTAACT CAACCGCACC AGTTGGCAAC TTATCATTGG 7140 

CTTGTCACGC GCAGCGACCT GACCAGTCAC CTCAATAACA AATTCGCTAC GAAGGCTTTC 7200

AGCTGTTGCC ATAACCTCTG CAGATACTTT TTCAGGGTTG ATAACCAACT GCATGATTCC 7260 

TTCACGGTCA CGAAGATCGA TAAAGATCAA ACCACCAAGG TCACGACGAC GGCCAACCCA 7320 

TCCTTTCAAG GTTATTTCTT CTCCCATCTG TTCCTCACGA ACACGACCAG CATACATACT 7380 

ACGTTTCATT ATTTCTCTCC TCTTTTATTC TGTTACTATT TTACCATAAA AGCGCAGCTC 7440

TTCATGAAAA TCATCAGAAA AGTTTGCCAG TCTTTAAAAG TCAGGTGAAA GCCCTAAAAA 7500

TTAGCGCTAA TACTCTTCGA AAATCTCTTC AAACCACGTC AGCGTCGCCT TACCGTATGT 7560 

ATGGTTACTG ACTTCGTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC 7620 

GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA CCTGCGGCTA GCTTCCTAGT 7680

TTGCTCTTTG ATTTTCATTG AGTATAATAC AAAAATCCGA TGAACTTCAC CGGACTCTTT 7740 

TATTTTGAAT TTTTGCCTGC TTTACGCTTT TCAGCGATTT CGGCTGCCTT TCGAGGCAAG 7800 

ACAATTTCCG TTATGTAACC CGTCCCAAAA CGCACTACAC CTGCAATACG AGCAAAGACA 7860 

ACTGCTAGAT AGTTATAGAA GAAATCGCCT TTGAAGGCAT AACCTACCGC TCCAATGATG 7920 

AAAAATAGAA CGACTGCCTG AATCACTGCT AATAAAATTA CTCGTTTCAT GTGACCTCCT 7980 

GACTCTATTA TAGCATGAGA ATCATCAAAA AGCCGACTAA ATTATTCAAA GCGTGAAGAG 8040 

AAATACTGTA GACCAGACCT TTTCTGCTAA TGTAAGCCAA ACCCAAACTA AAACCAAGGC 8100 

TAAAATAGAC AAAAAATTGT TGCACATCAC CTGGAAAATG AATCAAGGCA AATAGAAGAC 8160 

TAGATACCAG AAGAAAAATC AGGGTTCGTT TACTATTGTC CTGCTTAGGA AAGAGATAGC 8220

GTGCTAACAT CCCTCTAAAA ACAATCTCTT CCGTCAAAGG AGCAAAAATA ACCACAGCAA 8280 

AGAATGAGAA AAGTGGTTGA GACAAGGTCA AGTCTGTCGC TATTTGCTGA TTTACTGAAG 8340 

GATCATCTGG CAAGAAGAAT TGAACGACCA GAGATAAGAA CCAAACCAAG ACAGGAAGCC 8400 

AAATAAATCG ATTAAAGCCG CTCTTCTCAA TATGAACAGG AGCCTTCTGA TACCATTTGT 8460 

AAATGCCGTA CACATATACT CCAGCCAAGG CCACATAGAG TAGAGTAACA GCATAGGGTG 8520

AAGCGCCTAA AGCAAGCGAC GCAGTCGCGA GCCCCTGAAT AAAGCCATAG ATAAATAAAA 8580 

AGGATAGAAG GGCTAGAAGA ATCCAGCCAA GGTTTTTAAG TAATTTCATA GATAACTCCT 8640

TTATTTGAAA TAACGTTTTA CCATAGGTAA CTGCATCACA TTGATATAAA CATGGATGGC 8700 

TCCTACAAGC AAGAAAGCTA GTAACTGAAT CTCTCCTGTC AAGAAAGAAA TGATAATAAG 8760 

AAAAATATAT AAGGCTGGTA AGACATATTG GTGTAATTGG AATAAAATTC GAAAACTCTG 8820 

TTCCAAATTA GCCTGACGCT CCCCTTCATC ATAAGAATTT ATATAGTTCA AGACATCCTT 8880 

TGGTGTAGCG AAAAATTCCA AATCAAACTG ACGAACAATC GCAATGGTTT TAAAAAGAGA 8940 

TTTTTGAGCG ACTAAGAATA CCACAAAGAG TAAGAAAGAA AGGAAAAATG TTTCAGCGTT 9000 

TGTATGCAAT ATAATCACCT CACTTAATGA AATAAAAATA GCCAATGGAA TCGCTACACC 9060 

TGTAATATTA AAAGCAATGG TTCCAAACTC AAGATTCCGA TACATTTGCA CATAATAGGT 9120 

TTCATTCAGA TCGTCATCCA TTTCCTCTTG ATACAAAGAA TGAAATTTTC TGCTTTTCTT 9180 

TAAGAAATTG AAAGTCAAAA ACATACTAAT GAAACCTATC AGTAAACAAA TAGCTGATAT 9240 

CCATGGCATC AAGGCTTTTA CATCTAAAAT AATTTCGTGG GATTCGACAC GTGCCTTAAA 9300 

CATCCCTACA AACATGCCCA AGAACCCCCC AAGACAATAG ACATCAAAAA TAACAATCTA 9360

CGTTTCTTTT TCATATTCAT TCTCCTTTTT CACTTGCTAG ATTTTTGGAT TTCTTTTCAA 9420 

TCCATTCAAT TACTGGGATG AGAGCAAAGT AGACCCAAAC AAATTGGTCG CTTTGATAGG 9480 

GATTAAACCA GCTTAGGTCC ATCCCAATCA GTAGAAATAC GCTGACTAAT AAAGCTATGA 9540 

CCACTACATA ATAAATCACT TTATACTTGT TCATCACTCG TCCTCCTCCA AACGAAATAC 9600

CGATTCGACT GTTTCGTTGA AAATTTGAGA TATTTTCAGG GCAATGATAA TGGATGGGGT 9660 

GTACTCATCC CGTTCTAGTA GGCTAATGGT CTGTCTCGAA ACCCCTGCCA GTTTGCCTAG 9720 

GTCCCTTTGA TTGAGACCAT CGCGAGCTCG AAGCTCTTTT AGACGATTTT TTAGTTGCAT 9780 

GTTACACACC TACTCTCCGT CAAATTCAAC GGTTTGGATA TCCTCAATAC GTTGCAACTT 9840 

GAATTTTTCT TTTCCCGTAT TATCTACACG TCGTAGCTTT ACCCATTCCT CATCAACATC 9900 

CACAACTTCC CAGTTATCTC GCCCAATATA CACTCCCGTT ATAATTGGTT CCTTTCCAAT 9960 

CATTTCTTGT AATAATCTCG ACATTTCTGC GTTTCCTTTC TCTTTTCGCT CAAGTCTTTT 10020 

GATTTTATTC TCTACTTTCT TGATTTTTTT AGAATTATTA GAATAAAAGA AAATCATAAA 10080 

TAGTATAAAT CCTAGTACCC ACATTATAAC TCCTTTCTGC TTCCTATTTC TTAACTTGAA 10140 

TTCATTGTAA CATATCTTTT TCTTTTTGAC AAGTATAGTT GTCAAAAAAA TTATGATTTT 10200 

TGTCATTTTG CAAAAGAAAA AGGTCAGGAG TACCTTCCTG ACCACTTTAT CTATCATTAA 10260 

TACTCTTCTA AAATCTCTTC AAACCACGTC AGCTTCACCT TGCCGTAGGT ATGCTTACTC 10320 

ACTTCCTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC GTCAGTTCTA 10380

TCCACAACCT CAAAACCATG TTTTGAGCTG ACTTCGTCAG TTCTATCCAC AACCTCAAAA 10440 

CCATGTTTTG AGCTGACTTC GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA 10500 

CCTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTTATTG AGTATAAAAT CCTAGTTTTT 10560 

CAAAGATTTC TGAGAAGTTT TGGCTGATTG TCTCAAGTGA CACTTGCACT TCTTCTCGGG 10620 

TTTGGTTGTT CTTGACCGTC ACTTGTCCGC TTTCGACTTC GCTCTCTCCT AGGGTGATGA 10680 

GGGTCTTAGC CGCAAAGACA TCGGCTGACT TGAACTGAGC TTTTAGTTTA CGGTTGAGGT 10740 

AATCACGCTC TGCTTTGAAA CCTTGTTGGC GAAGAGCCTG TACCAATTCC AAGGCCTTGA 10800 

TATTTGCCCC TTCGCCCAAG ACTGCGATAT AGACATCTAG GGCGTTTTCG ATAGGGAGGG 10860 

TCACACCTTG CTTTTCAAGG ATGAGAAGCA GGCGCTCTAC ACCAAGTCCA AAACCAAATC 10920 

CAGCAGTTTC AGGGCCTCCA AAGTAAGCAA CCAAACCATC GTAGCGACCA CCCGCACAGA 10980 

CGGTCAGGTC ATTGCCCTCA ATCTCTGTGA TAAACTCGAA AATGGTGTGG TTGTAGTAGT 11040 

CCAGACCACG CACCATATTG GTATCGATGA TGTAATCTAC TCCAAGATTT TCCAACATCT 11100 

GACGCACAGC ATCAAAATGA GCTTGGCTTT CTTCATCAAG AAAGTCCAAG ATAGACGGCG 11160 

CATTCTCTAC TGCCACCTTG TCTTCTTTTT CCTTAGAGTC CAAGACACGA AGAGGATTTT 11220 

CCTCCAAGCG ACGTTGGCTA TCCTTAGACA AGGTCTCCTT GAGCGGTGTC AAATAGTCAA 11280 

TCAAGGCTTG GCGGTAGGCT GCACCGCTCT CAGGATTTCC AAGAGTGTTG AGGTGCAATT 11340 

TGACACCTTG AATACCGATT TCCTTCAAAA AATGGGCTGC CATAGCGATT GTTTCCACAT 11400 

CGGTAGCTGG ATTGCTAGAG CCAAAACACT CAACACCAAT CTGGTGGAAT TGGCGCAAGC 11460 

GCCCTGCCTG TGCACGCTCA TAACGGAACA TAGGTCCCAT GTAGTAGAAC TTGCTTGGCT 11520 

TTTGCACTTC TGGGGCGAAA AGTTTATTTT CCACATAGGA ACGGACAACG GGTGCAGTTC 11580

CTTCTGGACG GAGGGTAATA TGACGGTCAC CCTTGTCATA AAAATCGTAC ATTTCCTTGG 11640 

TTACGATATC CGTTGTATCT CCGACAGAGC GACTGATAAC CTCGTAATGC TCAAAAATAG 11700 

GCGTGCGCAC TTCTGCATAG TTGTAGCGTT TGAAAATCTC ACGGGCAAAG CCCTCAACGT 11760

ACTGCCACTT AGCAGACTCA GCAGGTAAAA TATCCTGCGT TCCTTTTGGT TTTTGTAATT 11820 

TCATAGGGAA TCCTCTTTAA ACTTAATAGT CTTATTTTAC CATAAATAGA GGGATTAAAA 11880 

CAGTAAGAAA AAAATTAGGA TTTAGATATC ATTTTTGAGA TTAAGAATTG TCAAAAAAAT 11940 

AGCTAGCAAG GAAAGACCAA CAAATAGCAT CCAAGTCAAC TGTATATTCC ATACGGCTAC 12000 

TAGTGAAAAA CAAGCTGTTC CCACAGGTAT GGATAAGGTA AACAATAGAC CTAAAAAATT 12060 

ACTAGTACGA GCTAGAACCT CTGGAGCTAG ATTTTTCATG AGCATGGCAC TAATCTTTGG 12120 

TTGAACTTTA CCAGACACAT ACAGAGTAAA GAAGAGAAAT AGCAAACCAA GCACGACTTG 12180

ATTGAATAAA TTAGCCAAAC CAACTAGACT AAGTCCTACG GTCTCCCACA TCATCAATCT 12240 

AGGCAAGGAC TGCTTCCCAA AATAATCATT GCCCGTAAGG CTACTGATGA TGACTGATAC 12300 

TAAAACACAG AATTGATTGA TAAATAGTGC CTCTGTATAA GAAAAATTCA AGAGAGAATG 12360 

GCTCAAAAAG AAGATATTAT AAATTCCACC CAAAGCGCCA CCCAAGGAAT TAATAAGCAA 12420 

GACAGCAAAG AGCATAAAAC CAAAGTTTTT CTGTCCACTT TTAAGAAAAA CGAGACGTAA 12480 

ATTTCGGTAA ATTGTTAGGA ACTGGTCTTT GATAGAAAGC TTCTCATTTT TTAAGTTTTC 12540 

ACCATCAGCA GATGACATTG ACAGGCTCAA TTTGCTTTTT CCTAAAAAGA GGATAGTGGC 12600 

TGATACTAGG AAAAAGCAGG CATTGATTCC CGCAACGAGA GAAAAATTGT TGACCGATAG 12660 

AGCTAAGAGC CAGACTCCGA AAGCTTGACC ACCAATAGCT GAAATATAGG TGATGAACTG 12720 

TGAAAAAGAA TAACCCTCCA TCAGATCATC TTCAGCTACT TTTTCCTTAA TAAGAGGCAT 12780 

ACGCAGGCCA CCTGCAAAAT CACTGATGAT ATCACTAATG ACATTGATCA AACACAGGCT 12840 

AGAAAAGGCA AAGAGACTAG CTTGCTGAAC AACTAGGGCT GCTAGAAAAA ATAGAACCCC 12900 

CTGAAACAAA CCGCTATAGA CCATCCATTT GACCTTGTCC CTCGTGTAAT CTGCCCGAAT 12960 

CCCTGCAAAA ACTGTAAAGA GGGTCGGAAG AATCATGACA ATATTCGCCA TAGCAACAGC 13020 

AAAAGATGCT TGTGACAAGG TCGATGCATA GACGATAAAG ACCAGGTTGA AAATCGAAAC 13080 

ACCAAAAGCA TTGAAGAAGC GTGG 13104 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19250 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

CCGGGCAAAT AGTTTTGAAC TTTTCATCAT TTTCTCCTTT AAAACTTTCT CTCCATTATA 60 

GACTCTTTTC AGAAAGTTGT CAACAGAATT TTCAGAATTT TTGAAAATTA TTTTTCAAAC 120 

AACATCTTTG CAAAAAATAT GAATATCGTA AGCGCGTCAT AACAAGGTAT CTATCATTCA 180 

TGGAGCTCCT CCTGTATACT ATTAGTAAAG TAAATATTGG AGGATATTTT AATGCCACAA 240 

CCTATTGTTC CTGTAGAGAT TCCACAATCT CGTCGTTTTG ATTCTAAAAA GAGAAATGAT 300 

ATTCTrCTTA AAATTCGTAT TGGCAAGCTT GAAGTAAGTT TTTTTCAATC TCTCAATCTC 360 

GAAATGATAG AACAGCTTTT GGATAAGGTG TTGCTCTATG ACAATTCATC TATCTAGCCT 420

AGGGCAGGTC TATCTCGTGT GTGGGAAAAC TGATATGAGA CAAGGAATCG ATTCACTGGC 480

TTATCTCGTT AAAACCCACT TTGAATTGGA TCCTTTCTCC GGTCAAATCT TTCTCTTTTG 540

TGGTGGACGT AAAGACCGCT TTAAAGTCCT TTACTGGGAT GGTCAAGGAT TTTGGCTACT 600

ATATAAACGC TTTGAGAACG GCAGACTGAC TTGGCCCAGT ACAGAAAAGG ATGTCAAAGC 660

TCTCGCACCT GAACAAGTAG ATTGGCTGAT GAAAGGCTTT TCTATCACTC CAAAAATATA 720

GTAGATTGAA ACTAGAATAG TACACCTCTG CTTCTAAAAC ATTGTTAGAA ATCGATTTTA 780


[illegible] CGATTTGTCC TGTTATTATT TCATTTTACT ATAAATCCAT CAGAAAGTCG 840

[illegible] TGAAATGAGG ACT^TTCTTTT TATACTCATC TGCTTTCAAA AAGCACTCTA 900

[illegible] GATTAACGAT GGACTTTATC ACCTCCTTCT CCAGTCCTTG TATAACATCT 960

TftAAGTTGAT TPATGAPATC* TTCCAAAGTT CGAAAGGCTT TATTCTTAAA TCCACGTTTA 1020

[illegible] [illegible] TTrftATCCGG TTCATCTCTG GTGTGTATGG AGGAATAAAT 1080

[illegible] TATTAGTPGG AATCTTTAAG GTACTTGATT TATGCCATAT AGCATTGTCC 1140

[illegible] AAAnATAATP ATTTGGATAA GCTTGTGAAA GCTCCTATTC CTAAAGCCCC 1200

[illegible] PTTGPGAGAG ACACTATTGA CTCAGCCCTT ACTTCATGCG GATGAAACCT 1260

OfTATOGGGT TCTAGAGAGT GATAGCCATC TGACCTACTA TTGGACTTTT TTGTCAGGTA 1320

AAGPAGAGAA APAAGGGATT ACGCTTTACC ACCATGATCA GTGTCGAAGT GGTTCAGTAG 1380

TAPAAGAATT PPTAP.P.AGAT [illegible] ATGTTCATTG TGATATGTTG CGGCAGTAAC 1440

[illegible] AGTPPTPTAG [illegible] GCGATAGCAG TCCAAGGTTT AGGAGTAAGG 1500

CGACGCTAAG PTTGGTAAAC TGCGAACAGC TAGAAGCTTA TCGTCAACTG GAAGAAGCTG 1560

[illegible] ATGTTGGGCG CATGTGAGAA GGAAGTTTTT TGAAGTGCCC CCCAAGCAAG 1620


CAGATAAATC ATCCTTAGGA GCTAAAGGTT TAGCCTATTG TGATCAGTTA TTTTCCTTGG 1680

AAAGAGACTG GGAGGCTTTG CCAGCTGATG AACGGCTACA GAAACGTCAA GAACATCTCC 1740

AACCCCTACT GGAAGACTTC TTTGCTTGGT GCCGTCGTCA GTCAGTTTTA TCGGCTTCAA 1800

AACTAGGAAG GGCAATTGAA TACAGCCTCA AGTATGAAGA AACCTTTAAG ACCATTTTAA 1860

AAGACGGACA TCTGGTCCTT TCCAATAATC TAGCTGAACG CGCCATTAAA TCATTGGTTA 1920

TGGGACGGAG TAAAAGAGTC CAGTGGACTC TTTTAGCCTA AGCTCAGTTT AAAAAAACGA 1980

GGGTGGTTAT TTTTAAAAAA GCGAGGGTGG TTATTTTCTC AAAGTTTTGA AGGAGCTAAA 2040

GCAAGAGCTA TTATTATGAG TTTGTTGGAA ACAGCTAAAC GTCATCAATT ATAGTGCGTT 2100

GAATCTATAA CAGTACGCAT CGACTGCTAA AATATTTCTA TAAATCAATT TTCCTTTCCT 2160

AATCGATTTG TTCATATCTT ATTACAATCC ATTATAAATA GCGAGAAATA TCTATCCTAT 2220

CTTCTAGAAT GTCTTCCAAA CGAGGAAACT CTCGTAAACA AAGAGGTTTT AGAGGCCTAT 2280

TTACCGTGGA CTAAAGTTGT ACAAGAAAAG TGCAAATAAG AAATCTCCAG ATTAGGAACT 2340

ATATATGAGT TCTCTAGTCT GGAGATTTTT CAATAGACTT CGTTATTGGG CGGTTACTTT 2400

CGAAACTTTG AAAACTTCAA AAAACGGATT TTTATCGCTC TGAACATCAA AAAAGAAAGG 2460

ACGAAATTTG TCCTTTCTCA AGCTTAGCTT TTCTTCAACC CACTACAGTT GACAAAGAGC 2520

CCTTTATTCT ATCAAACATG AAGCGCAAAA ACAAGCCAAA AATCCGATAG AATGGCTATC 2580

CCTCGACTAT CAAGTAAGAC ATTTCCATCA AATACGTTCA ATTTTACTCT TGTTCTACTA 2640

AGAATTAATC ATCTCGTTTT GATTTATTAA AAATATACAA TTCAGCTTTT CCTCCAAACT 2700

ATTTTATCCA CTATCCCTGT ATAGCTCTGT ATTATCTTAA CAACTTTAGT AGAGACATTT 2760

TCCTCAACAT AATCCGGAAC CGGTAATCCA AAATCCTCAT CTTGTGCCAA GCTAACAGCA 2820

GTTTCAACTG CTTGAAGAAG AGAATTTTCA TCAATGCCTG CCAAAATAAA TCCTGCCTTA 2880

TCTAAGGACT CAGGACGTTC TGTACTTGTA CGAATACATA CAGCGGGAAA AGGATAACCT 2940

TGACTAGTAA AGAAACTACT TTCTTCCGGT AAAGTTCCCG AATCAGATAC TACAACAAAT 3000

GCATTCATCT GTAAACAATT ATAGTCATGG AATCCTAGTG GCTCATGCTG AATCACACGT 3060

TTATCTAGTT TAAAACCGCT CTCTTGTAGC CTTTTCTTTG ATCTAGGATG GCAAGAATAT 3120

AAGATTGGCA TATTATACTT TTCAGCTAAT TGATTAATTG CTGTAAAGAG AGAAATAAAA 3180

TTTTTATCTG TATCAATATT TTCCTCACGG TGAGCTGAAA GTAAGATATA ACCTCCTTTT 3240

TTCAATCCCA AACGTTCATG GATATCTGAA GACTCAATAG CAGATAAATT TTTATGTAAC 3300

ACTTCTGCCA TAGGAGAACC AGTTACATAT GTGCGCTCTT TAGGTAAACC ACACTCATGT 3360

AAATACTTAC GTGCATGTTC AGAGTATGCT AAGTTAACAT CTGAAATAAC ATCAACAATC 3420

CGACGATTAG TCTCTTCCGG TAGGCACTCA TCTTTACAGC GATTGCCAGC CTCCATATGA 3480

AAAATTGGAA TATGTAAACG CTTGGCAGCA ATAGCTGATA AACAAGAATT TGTATCCCCT 3540

AAAATCAATA AAGCATCTGG TTTAATTTGA TTCATCAATT TGTATGAAGT ATTAATAATA 3600

TTCCCTACAG TAGCACCAAG ATCATCTCCA ACAGCATCCA TGTATACGTC CGGAGTGTCT 3660

AACCCTAAAT TATCAAAGAA AATACCATTT AAATTGTAAT CATAGTTTTG TCCAGTATGT 3720

GCCAAAATAA CATCAAAATA CTTTCGACAT TTAGTGATAA CACTACTTAG ACCTATAATC 3780

TCTGGACGTG TTCCCACAAT AATCAATAAC TTAAGTTTGC CATTATCTTT AAAGTGAATA 3840

TCACTATAAT CTGTCTTAAT TTTCATTTAT TTCTCCACTT GTTCAAAAAA AGTATCTGGA 3900

TGTCTAGGAT CAAATGACTC ATTAGCCCAC ATGACAGTAA TTAGATTTTC TGTATCAGAA 3960

AGATTAATAA TATTATGTGC ATAGCCCGGT ATCATATGTA TTGCTTCAAT CTTATCGCCC 4020

GACACTTCAA AGTTCAGAAT AGGATACTCT TGACCGTTTT CATCCAGCCC TATCCTACGC 4080

TCTTGTATTA AAGCACGACC AGAAACAACC ATGAAAAATT CCCACTPAGA ATGATGCCAA 4140


[illegible] TGGTAATGCC AGGTTTAGAA ATATTAACAG AOTATTGACC CGTATTTTCT 4200

CTTTTTAATA ATTPPRTAAA ACTACCTCGT TCATCTATAT TCATTTTTAG AGGAAACTTA 4260

[illegible] CTGGTAAATA AGATAGGTAG GTAGAATACA ATTTCTTTTT AAACGATCCC 4320

[illegible] CACGCATAAC TAAACTATCA GGCTGTTTTT TAAATGTTTC TAATAGAGAG 4380

[illegible] CPAACGTTCC ACGATGAGTC GTTGGTACGT* AGCAGTAGTT TCCTGATGGG 4440

[illegible] [illegible] ATCTAGATTA CAACGATGAG GATTTCCTTC CAATGCAGTT 4500

[illegible] [illegible] ATPAATATAP AGCAACTCCA ATTCTACACT TGGATCATTT 4560

[illegible] [illegible] APPTAfiATTA TAACAGAAAG TTGCTACAGC AGAATTGTAG 4620

[illegible] [illegible] ATAAAGATTC GGGAAACGGT AAACTAAGAC AGGTGCTCCC 4680

[illegible] [illegible] [illegible] TCCCCTGCTA GCTTAGATTG TCCATATATA 4740

[illegible] [illegible] TAAA("TAGCT TGAGTAGAAC TTGAGAGTAG AACAGGACAA 4800

[illegible] [illegible] AATPTfCAAT AATCTACTTG AAAAACCGTA ATTTCCCTCC 4860

[illegible] [illegible] [illegible] ACACCAGCTA AATGGAATAC GAAATCGGCC 4920

[illegible] [illegible] TAAAATCGGA TCKITATCAC GATCATACTG AAAAATCTCT 4980

[illegible] [illegible] ACSTCCTATCT* CGTCCATCTT TCAAAGCTTC CAGAGTACAG 5040

[illegible] TTPrTAPAAA TCCTTTCGCT CCTGTGATTA AAATATTTTT AATCATGCCC 5100

[illegible] [illegible] VVTTAATACT TAACTCTCTC GACAATACAT GATACATTAT 5160

[illegible] TAATnT'AAT CPATCTTAAA AGATTTTACA TCTCTTCGTC TGCTACCATA 5220


TCACGAATTG CTGTCTGTAT TTCATCTAAT TCTAGCAACT TTCTTTTAAC TTGCTCTACA 5280

TCCATCAAAT CGCTATTATT ACTATTGAAT TCTGTCAACA AATTTCTATT CGTACTACCA 5340

TCTTTGAAAT ACTTATCATA GTTAAGATTA CGATTATCAC TAGGAACTCT ATAAAAATCA 5400

CCCAAATCAA TTGCATTTGC GCACTCTTCG TTAGTTAATA GTGTTTCATA CCTTTTTTCT 5460

CCGTGTCTAA TACCTATAAT CTTAATATCT TGTTCTGAGG CAAAAATTTC TGATACAGCC 5520

TTAGCCAACA CTTCAATCGT ACATGCTGGT GCTTTCTGAA CTAGTATATC TCCAGATTTC 5580

CCTTCTTCAA ATGCAAATAA AACCAAGTCT ACTGCTTCTT CCAATGTCAT CACAAAACGT 5640

GTCATGCTAG GTTCAGTAAT TGTAAGAGCA TTTCCTTGCT TAATTTGCTC AATCCAAAGA 5700

GGAACGACAG ATCCACGGCT ACACAGAACA TTCCCATACC GAGTCACACA TATCTTTGTA 5760

TGCTCAGGAT TTACCGTCCT GGACTTAGCA ACAGCAATCT TTTCCATCAT AGCCTTGGAT 5820

GTTCCCATAG CATTGACAGG ATAAGCCGCC TTATCTGTAG AAAGACAGAT AACTTGCTTT 5880

ACACCAGCTT CGATAGCCGC AGTGAGGACA TTCTCCGTTC CCAAAATGTT AGTTTTTACC 5940

GCTTCTACAG GGAAAAATTC ACAAGAAGGT ACTTGTTTAA GAGCAGCAGC GTGAAAAACA 6000

TAATCCACAC CATGCATAGC ATTTTTTACC GAAGCTAAGT CACGCACATC TCCAAGGTAA 6060

AAACGGATTT TCCCAGCCAC TTCTGGTACT TTTACCTGAA ACTCATGACG CATATCATCT 6120

TGTTTCTTTT CATCTCGCGA AAATATACGA ATCTCTGAGA CATCTGTTTC TAAAAAACGC 6180

TTGAGAACCG CATTCCCAAA TGAACCTGTC CCTCCTGTAA TTAGGAGAGT TTTTCCTGTA 6240

AATTGTCACA TATATTACAC TTCTCCTTCT AGTATGTCTG CAATTTTCTT ACAAGCCGTT 6300

CCATCTCCAT ATGGATTTGA AGCTTGACTC ATTCCTTGAT AAACTGAATC ATTTTCTAAT 6360

AATTCTTTAA AATGCCTATA AATATTATTT TCATCAGCAC CTACAAGTTT CAAAGTCCCT 6420

GCTTCAATTC CCTCTGGACG TTCACTTGTA TCTCTCATAA CCAAAACAGG TTTTCCTAAA 6480

CTTCGAGCCT CTTCCTGAAT ACCACCACTA TCTGTTAAAA TTAAATAACT TCTTGATAAA 6540

AAATTGTGAA AATCTAATAC TTCTAAAGGT TCGATCATCT TGATACGTTC ACAGCCACTT 6600

AGTTCTTCCT CAGCAATTTG GCGAACACGA GGATTCATAT GGATAGGATA AATAGCCTTG 6660

ACATCTGAAT ATTCTTCAAT AATCCTTCTA ATTGCTCTAA ACATATGTCT CATCGGTTCA 6720

CCAAGATTTT CACGACGATG AGCTGTAATT AGAATAAACC TGCTTTCTCC TATCCATTCT 6780

AACTCAGGAT GCGTATAGTC CTCTTGAATT GTAGTTTGTA AAGCATCAAT CGCCGTATTA 6840

CCTGTCACAA ATATGCTCTC TGGAGTTTTT [illegible] AAACATTATC TTTTGAAAGT 6900

TGTGTTGGTG TAAAATGATA CTGAGCCAAA ACCCCAACTG CTTGACCATT AAACTCTTCA 6960

GGATATGGTG AATAGATATC GTAAGTGCGC AAACCAGCTT CAACATGACC AATTGGAATC 7020

TGTAAATAAA AGGCCGCCAG TGAACTAGCG AAGGTCGTAC TTGTATCCCC ATGAACTAAC 7080

ACCAAATCAG GTTTTTCTGA CTCTAAAATA GCCTTCATTC CTTCCAAAAT GCCAATGGTC 7140

ACATCAAATA AAGTTTGTTT ATCTTTCATA ATAGACAAAT CAAAATCGGG AATAATCCCA 7200

AATGTGTCCA AGACCTGATC CAACATTTGA CGGTGTTGGC CCGTAACGCA AACTAATGTT 7260

TCAATATTCT TACGTGTTCT TAACTCTTTG ACCAAAGGAC ACATCTTGAT GGCTTCTGGA 7320

CGAGTTCCAA ATACTACAAC TACTTTTTTC ATATATTTAC TTACTCCTAA CAAATAATGA 7380

ACGGTTCTTA AAATAAATTA GATAACGGCT AATCCATAAC ACCACCTCAG ACATACTTGA 7440




ACAAATAGCT AATGTTACTA AACTAAAATT ATCAGACAAG ATAAATATTC CTAATCCCAA 7500 

AGTTTGGACA ATCGAAGCTA ATATAGTTGT CATTGTACTT TCTTTCACTT TATCAATAGC 7560 

TCCTAAGACA GGCCATCCGT AAATCATAGA ATAAAAACTA GCAACAAAAG CGGGTAATAA 7620 

GTACTTAAGA AAATCTGCTG AAACGGTATA TTTTTCACCA CCAATTATAG AAAGAATTTG 7680 

ATTTGAAAAG AATAAAACTA TCAAAACTCC AAAGATAATA GGAATAAACA TAATCCGATT 7740 

AATACTCTTA ACCGATTGTA TATCTTTAGT ACGTATCATA TGCGGATATA AACTATTCGC 7800 

TATAGGATTA TACAATGATT TTCCTGCTGA AAGCAGTTGC ATTGCTATCC CCCAAAAGGC 7860 

TATCTCTTGA CTTTGTAAAT AAAAACCCGA AATGACTGTC GTAAAGACGC CAAAAATAGT 7920 

AGTTGCAAAA TTGGATAAAA AATAAATAGA GGATTCCTTT AAATCTTTAA CCCAAACAGA 7980 

CAGATAAGAA AATGATAATT TAATTCCATA ATAATGAAGG AATCTATAAG AAACTACTGC 8040 

AGCAACTAAA TTCCCAATTC CTTCCAATAT AGGAATCCAT AAAATAGAAG AATCATCTTT 8100 

TACTACAATA AATGTCAAAA TTGTAATGAT AGTTTTAGAA ATAATATAAG GAATTGCAAC 8160 

TGCATGCATC TTTTCAATTC CACGAAATAA AAAGTCAAAG ATAAAAATAT TGGTCACTGT 8220 

ACCTAACAAA TAAAAAACTG AAAAAAGAAT ATTCTCTCTC ATTATTGGGA TTTGCCACAT 8280 

CAATATGGTG TAAATTAGAA TCGAAATGAT AGATAAAAAT ATTTTTTCAA CTACAGTATC 8340 

TCCAACTATC CTTCCAATCT TTGAGGGAGT AGTACAAGCA TTTACAATAT TTTTTGTAGC 8400 

TGATATCATG AAACCAAAAT CAATCACCAG TTGAACATAA GCTATTAACG CTTTAACATA 8460 

AATAACCATT CCATACGCGT CTAGCGAAAG CACCCTTGTC AAATACGGGA GTGTTAATAA 8520 

AGGAAATAGT AATTTAACAA TATTCAGAAT ATAGAGAGAA CTTGTATTTT TTATAAATGA 8580 

AATTCTATCA ACTTTCACGA ACTAGTCCTT CCAAAAAAAG ATCTAAATAG TCCAAACTAC 8640 

TTCTCGCTTT CAACACCAAT TCTGAAGGTA TTGTTATCGG TTTTAGATGA AAAGTTTCAA 8700 

GTTTCTTTAC AATACTATTA ACACTTGAAT CAAATAAAGA TTCACAACGT TGTAACTCTC 8760 

CAATTGCTCC ATAATAACGT GCTGTTTTTT CTGCATCCCA TGCAATGGCA ATCACAGATV 8820

TATTAAAACA TGTTGCCACT ACCCCAACAT GTAATTTACA AGTTAAAACC ACATCTACCA 8880

TTTTCAACAA TGATGTCATT TCTGCAGGAG AATGATACTT GAATTGAAAA CAATCCTCAG 8940 

TTCTAACTAA TTTTCTAAAT TCCTGATAAT AAGCATCTTC ATAAGGTAGA ATGGAATCCG 9000 

AAGTTACTAC AACATAATAG TTAGGATTGT TTTCTAGAAA AAGACTAATT GATTCCGCAA 9060

ATTTTTCAAG AGCTTTTTTG GAATGATTAT AGTGAACAAG AATTATCTTC TTATCTTTAG 9120 

CTTCTCTTTT CAATTGACAC AGCTGCTCTG TTTTTTCTTC TCTTAATTTA CTTGAAATAA 9180 

TTAAATCAAA GGTTTCATGC ACTGGAGCCG AAGGCGACAA ATGCTTCAAA GAATCAAATG 9240 

ATTCTCGATC ACGAACTGTA ATAAATTGAG CATGATTAAT AATTCTCTTT ATACCATAAT 9300

TCATCAAAGA ATCGTTATTA GGCCCTGCAC CAATACCTAA TACTCCTATA GGCTTTTTAA 9360 

AATATGAAGC CCAAATTCCC AAAGGTAAAA ATCGTTTAAA TTGGATTAAA TTATCACGAA 9420 

AACGTGCATT ATGCCCTTCC CCAAAATATC CTCCCGGGAT ATACAAAATA GCATCTGCTT 9480

GTTTTTTAGT AAAACTTTGT TTTTGGCGAT ATTCTTTCAA GTACATTTGA AAGAAATCTG 9540 

ATGGATTATA AAAAGAAACT TCATATCCTT TAGATTCTAA TAAATCATAG ACAATCTCAC 9600 

CGTAAAGATA ATCACCGTAA TTACTTGAAC CATAATCCGT TGCACCATGT AACATAATTT 9660 

TTTTCACCAC TATTTTTTCA ACCTCCTAAA AATAAATATC ATAATCAAAC TATACATAAT 9720 

AGGACGATAA ACATCTATTG AACTACTTCT CACTAAAAGC AATAGTTGAG AAATTACCGA 9780

AAAATAAATA ACTTTTGAGA TTTTACTTGT TTGAAAAGCT CTGAAATTTA ATCGCCATCC 9840 

ACTAAATATT CCCAAAACAA AACTCCAAAA AACACCACCA TAGTAACCAA ACTTCCAAAA 9900 

TAATTCTTCC ACAAAAGAAG AGCCTACAGG TAACCCCAAA AATTTATTAA TAACAACCGT 9960 

CGCTGATGCT TTATCAAAAA AATCACCAAC TAACCATCCA ATAGGAAAAA TTGATAGGAT 10020 

AGTCCGTAGA AATGTCATCC CATATTCATA TGGAATGCTA CTAGGCACAA CAGTTACAGC 10080 

AGAAGCTACT GTTAGGCTGG TCAGTCCCGA CTCTGAAAAT ACTTCCCCTA CTATATTCTT 10140 

TACAAAATCT AATGAAGAAA AGGAATCAAA TAAGTATATA CCTATAGTAT TCAAGTCGAA 10200 

ACGGTGCCCC CTAATAACAA CTAATACATT TAATAGAAAT ACAGTTACTA TTAAAAATAC 10260 

AAGTACTCTT TTCTTCGAAA AAGTAATCCC TAAAGATTGT GTGTATACTA AAACCAACGC 10320 

CAAGATTGAA AACACCTGGA TTTTACGACT TCCTGTTAGG ATCATTATCA AAATTAGGTA 10380 

AAACAACATT ACCCAAAAAA TAGTACGCTT TATAACTCGG GACAGCTTAT CTGAATAAAA 10440 

CAAGGAGAAC ACACCAGGAA GCATAAGTAC TCCTAAATCA TCTATTATTC CTGAACTAGC 10500 

TGCCTCTGAA TATGCTGAAT AGCTATTCGC CGCTCTAACT GCTAGTACTG TTTTAGAATC 10560 

AGTTATTACC CTAGAAATAA AGCCCACTCC TGTTAAAATC CTACCCGCAT TGTACAAAAT 10620 

TTTCTCTTCA TTTTCCTGAT AATTTTGTAC TTCTGAATGA TAATGTACCT TTCCATCACT 10680 

ATAAAAAAAT AAATAGCCTA CAGAATAACA AAACAAAATC CAAATTATAA AAATATATGA 10740 

ATGAAATAAT TCTTCATTAT TATAGAAGTT ACTAGGGCTC CACAGCAGAG TTGTTTGAAA 10800 

CCCCATATAC TCATTGAAAA TTAATCCAAA CATAAAAAAA TAAGATAAAA TCAGATACCA 10860 

TACAGAAAAA TCATATATAC TAACTTTTTG TAAAATAAAA CCAGTAATTT GAAAAATAAT 10920 

TAGAAAGCAA ACCCATATAA ATATAGACGG AACATAATTA GATATAAGAA AACCATTATT 10980 

CCAATTATCG AGAGTCCAGA ACAAGTAACA GAAAGCAAAT ATAAJ^CTTA ATGTCACTAG 11040

TGTCACTCTA CAAATATACT TTGTCTGCAT CTATATCTCC TTTATTACAC ACATTTCTTG 11100 

ATAACGATTC AATAATTTAC TAGCTTGATA ACAAATATCA TAGAGTCCAT CTGTCATACT 11160 

GTTATTTATT TCAAAACGAT TGCATTCCTC AGATGTTAAA GACAGTACTT TATCTTTCCA 11220 

TAGCAACACA GACTCTTCGT TGATAGGTAA GTAACTAATG TTTTTGGTCA CATCTACTTC 11280 

TTGCGTCACT CTATCTGACG ATAAAATTTG TAATCCCGAT GCCTGAGCCT CTACTAGAGA 11340 

AACAGGCAAC CCCTCATATT TAGACGGAAG CAAAAAAACA TCCATCGCAG ATAATAAATC 11400 

AGAAATATCA GTCCTTCTCC CTAAAAATAG CACATATGGG GTCAGATTTA GTTCTAAAGC 11460

TTTCTGTTTT AATTTCTGCT CATCCTCACC ATTACCAACT AGGAGTAAAA TAACATTTGG 11520 

TTTGATTAAA ATGAGTTCTT TTAAAACGTT AAATAAATAA CTTTGGTTTT TTTGATCTGA 11580 

TAGGCGAGCT ATATTTCCTA ATACGAACTT ATTTGACACA TCTAATTCTC TACGACATTT 11640 

TTCTCTAACA TCTGACAAAA ATTGATACTT TTTCAAATCA ATTGCATTAA AAATAATTTC 11700 

AATTTTTCCG TCTTTATACG CTTTCTCTCC ATATAACCAC TTAGCCGAAT CTTCCCCACA 11760 

TGCAAACCAA TGAGTTGCTA AGATTTTTAC CAAAATTGTT ACTAATTTAC GCAATACTTT 11820

TTGAAAACTG TTTTCTGTTA CATAAGCCAT ATGACTATGA ATAATTCTAA TTTTACAACC 11880 

AATTATTTTA GATAAGATCA GACCAATTGC AGATTTATAG CCATGGCAAT GAACTATATC 11940 

ATAATCTCCT TTCTTTATTA TTCTAGCAAG AGAGAGAAAC TGATGTAGAG GCTTTTTCCT 12000 

TAATAGAGGC ACATGATAAA CCTTTGCACC CAATTCTTTC ATTTTATCCT CTAAAAATCC 12060 

TTGTTCTTTT CCAGGCACAA TAAAATCAAA TTGAATTTTT TTTCTATCAA TGTGAGAATA 12120 

ATAGTTGAAT AGAAAACTTT CTACTCCACC ACTATCTAGT GTTGTAAATA GATGTAATAC 12180 

TTTAATCATT CTTCTTCCTT AAGCTTAAGA TTCGCTTCTC TAATTCTA7T TCTGTTTTTT 12240 

GTTTTTCTAA ACTAATTCTG TCCATGAAGT TATCACAATT CTTAATTAGC TGTTTCCTGT 12300 

CAAGGTTTTG AATATACAAA GCCAAACAAT CTTTTTCCGA TTCATCCTTC ATAGGTAAAA 12360 

CGAAACCAAA ACCATTCTCT ATTGACACTT TTTCCATATA AGTATCTTCA CAAACTAAAA 12420 

TAGGTTTATA CAACAATGCA GCAAAGTAGA GTTTATTAGA CAAAGCATAG TCTAGTAAGG 12480 

GAGTGTGATT CCCGTATAAA TTCAAAACAA CATCTGTATT CTTATAAAAA GACATGGTAT 12540 

CTTTAGGCTG GAATGTGTCC ACCAAGTTAA CATTGCTGAT ATTTTTTTCT TGACAAAATT 12600 

CCCTTAATTC TCCTGCATTA GTACCTATAA AATTCAACTG AAATCGACTG TCATTTGCAA 12660 

AAAAATCGAT TATTTTTTTA T T W rT C TT GAAAACGAAT TAAACCAATG TAGGAAAGTT 12720 

GAATTGGAAA CGTACTATTA TTTTTTAACT GCTTTACCTC GTTTAATTCT ATCATATTGG 12780 

GTAGGTTATG GGTAGTAAAA TACTCTCCCA TTGGTAAAAA AAATTTATAG CCGTCTGAAG 12840

AAACGATATT CATTAAAGAA TTTTTCACCA ATTGTTTCTG AACCAAACGA TAAACCAAAA 12900 

ATTTTTCATA ACTGTAATCA CGAATATCAT AAATATATCT ATTTTTAAAT GAAAAGAGAA 12960 

GAAAATCTAC TAAAATGAAA GACACAATAC TATGTAACGG CAATATCATA TCATAATCAT 13020 

TTTCTTTTAG CTTCTTTTTA ATTTCTTTTC TGAATTTTAC ATAACCTAAT ATCTTACTTA 13080 

ATTTTCCTTT ACCAGAAAAA GAAATACGAT AGTAGTTTTG TTTTGTAATA ATCTCGTTAA 13140 

TATTCTTATC CCAATATATA ACATCGTAAC TAATAGACAG TTTCTTCAAT AATTCTTTAT 13200 

AAAAATTGAA GTAAGGAGTT AGATATATAT TATCAGATAG TATAAACAGT ACTCTCATTA 13260

AATTATTCTT TCTTACTTTC CCTCTCTAAA CATGTCTCCA GTTCGAGCAT AAACTGCTCT 13320 

TTTGAAAAGT GATTTTCATA GTAACAACGA GCTTTCTTTC CTAACTCTCT TTGTCTCTTA 13380 

ATAGATAACA TACTAAATTT ACAAATATTT TTTGCCAATT GTTTTACATC TCGTTCGCGA 13440 

CTAACATATC CACAATTTGC TTCTTCTACA ATTATTTTAC CATCTCCTGA AATTGCACCT 13500 

ATAATTGGTT TGCCTGCCGC CATATAAGAk TGTACCTTCC CAGGTATAGT ACGAGAAACT 13560 

ATCGAGTCTC CTATTAAAGA AACTAACATA GCATCTGATT TTTTATAGAA GGATGGCATT 13620 

TCCTCCAAAG AACGTCTTCC ATAGAAGGAA ATATTCTTTA ACTCCAATTC ATGAGCTAAT 13680 

GCTTTCATGC TTAACAATTC CGTACCATCT CCAACAAAAT GAAAATGAAT TTTCTrGGGT 13740 

AAATTGGTAT TCTTCTCTAT CAAACTGGCA GCTTTCAAAA TAGTTTCCAA ATTTTGTGCT 13800 

TTGCCAATAT TACCAGCAAA AGTTAGGTCA ACACTTTCTT TATTAACTAT AGATTCATCA 13860 

GGGATAAAAA GATCTTCTGC ATATTGTGGC AAATATGTAA TCTTTTCTTC GGATATGTCA 13920

AATTGCTTCA CAAAATAATT TTTAAATGA? CGACTAGTGA CAAA7ATATA ATCACTAGCT 13980

CGGTAAACTT TTTTTGAGAT AAATTTAAAC AGCTTGAAAA TCAACCCATC TTGTTTCACT 14040 

CCACCTACGG TTAAACTATC TGGCCAAACA TCCATACAAT ATAGAAACAT CGCTTTCTTA 14100 

TATTTTTTTT TATAAGCCAT ACCAGCCCAT GCCATCATAA CTGGAGACAA TTGGTTAACG 14160 

AATACACAGT CAAAATTCGA TCCATCTTTC GTTTTATACC TCCCCAATAA AACTCCTAAA 14220 

GTAGAACTAA TTGCAAAGCT AAAATAATTC AACAATCGAA ATACAACACT TTTTTTTCTA 14280

GGGATTGTAT AAGAACGATA TATCGTAACA CCTTCTATAA TCTCACGTCT TTTTTTATTA 14340 

TGACGATAAT CTGCATATAT CTTCCCTTCA GGGTAATTAG GAATCCCAGC CAAAACAGAG 14400 

ACTTCATGCC CTTTTCGAAC TAAATCTTCA CAAATATCTG ACAACCTGAA TGGTTCTGGC 14460

TTATAATGTT GGCAAACAAA TAGTATTTTC ATTGTCCAAT TTAACTTTCT TTCTTACCAC 14520 

TACCCTCTAC AATACCTTTT CGTTTCAGTA CGTAAGGTAT TGTCTTAACT ATACATCTAA 14580

TATCCATTAT CAAAGACAGA TGTTTAACAT AGTAGCCATC TAACTCCGTC TTCATCTCAA 14640 

CAGACAAAGT ATCACGCCCG TTAATTTGTG CCCATCCAGT TAACCCTGGC AAGATATCAT 14700 

TTGCTCCATA CTTATCTCTC TCTGCAATCA AATCTAGTTC ATTTATACCC GCTGGTCTAG 14760 

GACCTACAAT ACTCATATTA CCAACAAGAA TATTAAACAA TTGTGGTAGT TCATCCAAAG 14820 

ATGTTTTTCG CAAGAAAGCC CCTACTTTTG TAATCyATTG CTCTGGATTA TATAAGTTTC 14880 

GAGGCGCCAC ATTTTTAGGT GCATCTATTT TCATAGACCT AAATTTCAAA ATATAGAAGT 14940 

ATTCTTTATG AATACCAAAG CGTTTTTGCT TAAATATAAC CGGACCTTCT GAATCAAGTT 15000 

TAATCGCAAT TGCAATTATC ATAAAAACCG GACACAATAT TATTATCCCT ATTAAAGATA 15060 

ATAATATATC ACCTAATCGT TTTATTATAC CGTACATAAA CAACCTCCAA CTATAAATTC 15120 

TATTTCCATT TTTCATTCTA TTTCCATTTG ACAAATTAAA TCAGGCAGTA CATGCAACTA 15180 

CAGAAACTCA ATATATATTT GGTCACTCAA TGATTTTCAG AAATATAATT CTTTTATCCT 15240 

CTACGTCAGA TAAAACTTTT CTCCATCTAA ACAAAATTTA TTTGTTTCAG TAATATATGA 15300 

GTTCTCAATA ATGAATTAGA AGGTCCAGTT CAATTATTCT TCCAAATAGA CCGAATATTA 15360 

TTTGAAGACA TATCGGTTTC TGAAATTGCA ATCAGTACAT AAGCTAATAA ACTGATAAGT 15420 

ATGCTCTGTA AGAATGCCAG AGTTATATTG TAGTCCCCTT CCATACTATA TTCATTTTAT 15480 

TTTTTACCAT AATTTCCATA GGAACCGTAA ACTCCATACT TATTAACCGA GATATCCAAT 15540

TTATTTAAAA CAACTCCTAG GAACAGTTTC CCTGTTTGTT TTAATTGTTG TTTCGCTTTT 15600 

TGGATATCAC GTTTATTCGC CTCACCTGTT GCTGTTACCA AGATGGACGC ATCACACTTT 15660 

TGAGTGATAA TTGCCGCATC AATAACAATT CCAATAGGCG GTGTATCAAT AATGATATAA 15720 

TCAAAATATT TACGCAATCT TTCAATCATA TCATTAAAAT TTTTACTTTG TAACAAGGCT 15780 

GTAGGGTTTG GTGATACAGA TCCCGATTGA ACTACAAATA AATTTTCAAT ATTTGTATCA 15840 

CATAAACCGT GAGATAAATC AGCTGTCCCA GATAAAAATT CTGTTAGCCC TGTAATTI IT 15900 

TCACGAGATT TAAAAACTCC TAACATAACT GAATTTCGAG TATCGCCATC GATCAAAAGA 15960

GTTTTATAGC CTGCACGCGC AAACGACCAT GCTATATTTA TGGAAGTAGT TGTTTTTCCT 16020 

TCCCCAGGGT TAACAGAAGT AACGGAAATT ACTTTTAGTT TATCTCCGCT CAACTGTATA 16080 

TTTGTACACA AGGCATTGTA ATATTCTTCT GCCTTCTTAA TGAACTCCAG TTTTTTTTGT 16140 

GCTATTTCTA ATGTCGGCAT CCTTCTCTCC TATTTCAACT TACCCAAGTT TGGCACAACT 16200 

CCCAAAAGTG TCATCTGCAA TGTATTTTCG ATATCTTCCG GACGTTTCAC ACGAGTATCC 16260 

AAAAGTTCAA GATGAAGAAC TATAACACTA GTTCCAATCA CCCCTGCCAA AAAACCAATT 16320 
AGTGTATTGC GTTTAATATT TGGCGAAGAC GGGGATATCG CCGGCCTTGC CTCCTCCAGT 16380

GTTGTCACGT CAGAAACACG AGTAATACTG ATAATTTTTT GAGCAGCTAC TTCTCTCAAA 16440

GAGTTAGCGA TACGGCTTGC CTCTTCAGGA ACTCGATCAT TAACTGAAAT AGAGACAATA 16500

CGGGTATCAA CTGGTACTGT CACTTTAATT TTATTAGCCA AACCTTTTGG CCTCAAATCT 16560

AGTTTCAAAT CAGAAACAAC TTCCTCCAAA ACATCCTGCG AAAGGATAAT CTCACGGTAG 16620

TCTTTTACCA GATAAGTTCC TGCCTGCAAA TCCTGATTTG TCAACCCCGG CTTGTCTCCT 16680

TGATTGCGAT TCACTACGTA AATTCGCGTG GTACTCGTAT ATTCTGGCTT AACAATAAAA 16740

CTGCTATATC CAAAAGCCCC CGCACCTGTC ACAAGTGCCA CTATTAAAAT CATTAGCTTG 16800

CGTTTCCACA AGCTTTTAAC TAATTGAAAT ACATCGATTT CTATCGTATT TTGTTCTTTC 16860

ATCATTTCTC CTAAATTAGT TGATCCATTA CAATTTTTCG AGGATTGTCT ATAAAAAGTT 16920

CCTGAGCCTT CGCTTCTCCG TATTTTTGGG TAACAAGGTC ATATGCTTCT GCCATATGAG 16980

GAGGTCTACC GTCTAGATTG TGCATATCAC TTGCAATGAC ATGAACCAAA TCCTGCTCTA 17040

AAAAATACTG AGCTCTTTTT TTCATGAATT TATAACGTTC GCCAAAAAGT TTGGGTTTGA 17100

GGACATGTGA ACTATTTACT TGCGTGTAAC ACCCCATATC GATCAGTTCT CGAACGCGTT 17160

TTTCATTATT TTCAAGAGCA TCATAGCGCT CAATGTGGGC AATGACTGGA GTAATTCCCA 17220

ACATCAAGAT CTTGCTCAAG GCGCTATGAA TATCGCGATA AGGAGTGTTC ATACTAAACT 17280

CTATCAAGGC ATAACGACTA TCATTGAGGG TCGGAATCCG CTTTTTTTCC ACCTTATCCA 17340

GAACATCTGG TGTGTAATAA ATTTCACCCC CGTAAGCAAT GACCAAGTCA CTCGCCACTT 17400

CCTTAGCTAT TTCCCGAACC TGAAGAAAGT TTTCTGCTAT CTTCTCTTCC GGAGTTTCAA 17460

ACATGCCCTT GCGACGGTGA GAGGTAGAAA CAATGGTTCG CACCCCCTGT CTGTAGGATT 17520

CTGCCAAGAG AGCCTTGCTT TCCTCTCTTG ACTTGGGACC GTCATCTACA TCAAAAACGA 17580

TATGCGAATG GATGTCTATC ATTTCATCTA CCCTCCATCA CATCCTGTAT AGCTGCTTTA 17640

ACTACAGCTA AACTACTATC ATCTATTTCC ATCACATAGA GGTTACTGTC TGGCATTGCA 17700

TAAGAAGGAA GATCCATCCG ACCTGTCCCT TTTAAATCTT GAGAATTTAC TTTATAATTC 17760

CCTCCACTTT CTAACTGAGC ATTGACCAAA TTTATCATGG TCTCAAGTGC CATATTTGTT 17820

TGGATAGAAT CTTGCAAGCT ATTAATGATC GTACTATAAT TTTTCAGCAC TTCGGTTGAC 17880

GTTAATTTTT GAAGGATAGC CACAATCACC TTTTGTTGAT GGCGCCCGCG GTCACGATCG 17940

CCATCTGCTA GGGAGTAGCG CTCACGAACA AAACCGAGAG CCTGTTCTGA ATCAAGATGA 18000

ACATTGCCTG CAGGGTAATA CTTTCCATTC GTATGGGCAG TAAATTCTTG ATCATTATAA 18060

ACATCAATTC CACCCAACAA ATCAATCAAT TTCAAAAACG AAGTGAAGTT CAATCGCACA 18120

TAGTAATTGA TATCCACTCC ATAGAGATTT TCTAAGGTGT GAATGGACGA ATCAACTCCA 18180 

TAAATGCCCG CATGAGTCAA TTTATCTTTT TGATTATTTC CACCATCTGC GATTGGTACA 18240 

TAGGCATCAC GTGGCGTTGT GGTCAAGAGG ATTTTCTTGG TATCTCGATT GACAGTCATC 18300 

AGGATGTTGA CATCTGATCG CGACACCGAA CTAATAGGAC CATAGGTGTC AATTCCACTA 18360 

ACATAGATAT TGAAAGACTG ACTCTTAGAC GTCTTAGGAG CTTCTACTTT TTTAGTGAAT 18420

CCCTTAGTAT AAATCTTTTT TATCTTCGAT GCGTAGTCTG CATACTCTGA CTCGATGATG 18480 

TTTTCAAAGA CACTATTTAG GACAATGGCC TTAGTCTCCC CTGCAATCAA ACTCTTGTAA 18540 

GCTGCCAAGT AAGACGAACT CTGGTTGACC GTCAAATCGG TATTCTGACT TGACTTGATA 18600 

TCAGCTAGTA ATTTCTGAAT ATTTTCATTA TTAGTCCCAG TCGGTGCTGT CACACTCGTC 18660 

AGTTGCGTAA CATTTTCGAT CTCACTATCT GCTAAAACAG CGACACTGAT TGAATATTCT 18720 

GAGTAATTAG AAGTCGCATT TAAACGATTG GTCAGTCCAA CAAACTGCTG TACTGCAAAG 18780 

AGCGACACAG AGCTGACAAG GATAGAGAAC ACCAACAGAA AAATAGTAAA CTTTTCAGCT 18840 

TTTTTATAGA TAATCAAGAG TAGCCCTACC AAGGCAACTA GTAGGACTAA CGCAGTTACC 18900 

ACTAGATTAA GATATCTAAA AGCAAGGATA TTGTACTTAA AGATTAAGAA CAATAAAAAA 18960 

CAAACTAACA ATAAATAAAT AGTCAGCAAA ACTATATTAA CACTTCGCTT CACTTTCTGT 19020 

GAACGTGATT TTTTAAAACG TCTACTCATG ATTAATACCT ATACATTGAA CATTATACGA 19080 

TTATATCACT TTTTTACGGT AATGTCTACA CCTTTATTTT TACTATCTGC ATCTTTAAGT 19140 

ATCTTAGTAG ACTTCCCGCG AAACAAAAAT ATAGTAAAAT GAAATAAGAA CAGAACAAAT 19200 

CGTTCAGGAC AGTCAAATCG ATTTCTAACA ATGTTTTAGA AGCAGAGGTG 19250 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21706 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AAAGTTGAAA GACTGCTAGC TGTTTTTGAT ACCAATCGTT TCCAACTACA GAGCAAACAG 60 

TATACAAAGT TTGTTTTTGG ATGTAAGCTT CTTGATGGAC AATTCCAAGA AAATCAAGAA 120 

ATTGCTGACC TTCAATTTTT TGCCATTGAC CAACTGCCGA ACTTATCTGA AAAACGCATT 180 

ACCAAGGAGC AAATAGAGCT TCTTTGGCAG GTTTATCAAG GTCATAGGGG GCAATATCTT 240

GACTAAGAAG ATGATTATCG TATTTCTAAA TCCATTTTTA ACAACTAGCA TGGTATAATA 300

ATATGCAGGA AAATTTTGAA TTATGAGGAA GACTAGATGA ATTTATGGGA TATTTTCTTT 360 

ACGACTCAGG CAACCGAGCC GCCCAAATTT GACCTTTTTT GGTATGTTAG CCTATTTACG 420 

CTCTTAGCCT TAACCTTTTA TACAGCCCAT CGCTATCGTG AAAAGAAGGT TTACCAACGA 480 

TTTTTCCAAA TCTTGCAGAC TGTTCAGTTA ATCCTTCTTT ATGGTTGGTA CTGGGTCAAT 540 

CATATGCCAC TGTCAGAAAG CCTACCCTTT TACCATTGCC GTATGGCTAT GTTTGTGGTA 600 

CTCTTGCTTC CTGGTCAATC CAAATATAAA CAATACTTTG CATTATTGGG AACATTTGGG 660 

ACATTAGCAG CCTTTGTTTA TCCAGTGCCA GATGCTTACC CTTTTCCACA TATCACCATT 720

CTATCCTTTA TCTTTGGTCA TTTAGCACTC TTGGGGAACT CTCTAGTTTA TCTATTGAGA 780 

CAGTATAATG CGCGATTGCT GGATGTGAAG GGAATTTTTC TCATGACCTT TGCCCTAAAT 840 

GCCTTGATTT TTGTGGTCAA TTTGGTGACA GGTGGCGATT ACGGATTTTT GACAAAACCG 900 

CCATTGGTTG GGGATCACGG TCTAGTAGCT AATTATTTAC TTGTTTCAAT TGTGCTGGTA 960 

GCTACTATCA GTTTGACTAA GAAAATCTTA GAATTCTTTT TAGCTCAAGA AGCAGAAAAA 1020 

ATGATTGCAA AGGAAGCTTA ACACAGAGCT TTCTTTTTTG CTCTTAGAGA GTTTTTACAA 1080 

GCAGCTTATA AAATAAGAAT TTCTGAATAG ACAAACTCAA AAAATGGCTG GGAAATTTAG 1140 

GAAAAAAGCA AGCACGATTA AATTTTTTGT GTTATAATAT TTTGTGAATA GCTATGCCTA 1200 

TGTTTAGCTA TGGAATAATA CGAAGTGCGA AACTTGCAAC ATAGAGAGGA AGCGATGTAA 1260

TGGCTAGAGA AGGCTTTTTT ACAGGTCTAG ATATTGGAAC AAGCTCTGTC AAGGTGCTTG 1320 

TGGCCGACCA GAGAAATGGT GAATTAAATG TAATTGGCGT GAGTAATGCC AAAAGTAAAG 1380 

GTGTAAAGGA TGGAATTATT GTTGATATTG ATGCAGCACC AACTGCTATC AAGTCAGCCA 1440

TTTCCCAAGC GGAAGAAAAG GCAGGCATTT CGATTAAATC AGTGAATGTC GGCTTGCCTC 1500 

GTAATCTTTT GCAGGTAGAA CCAACTCAGG GGATGATTCC AGTAACATCT GATACTAAGG 1560 

AAATTACGGA TCAAGATGTT GAAAATGTTG TCAAATCAGC TTTGACAAAG AGTATGACAC 1620 

CTGACCGTGA AGTCATTACC TTTATTCCTG AAGAATTTAT TGTGGATGGT TTCCAAGGGA 1680 

TTCGTGACCC ACGTGGCATG ATGGGGGTTC GCCTTGAAAT GCGTGGTTTG CTTTATACAG 1740 

GACCTCGTAC TATCTTGCAC AATTTGCGTA AGACGGTTGA GCGTGCAGGT GTTCAGGTTC 1800 

AAAATGTTAT CATTTCACCA CTAGCAATGG TTCAGTCTGT TTTGAACGAA GGGGAACGTG 1860 

AATTTGGTGC TACAGTGATT GATATGGGGG CAGGTCAAAC GACTGTCGCT ACAATCCGTA 1920 

ATCAAGAACT CCAGTTCACA CATATTCTCC AAGAAGGTGG AGATTATGTA ACTAAAGATA 1980

TCTCCAAGGT TTTGAAAACC TCTCGCAAAT TAGCGGAAGG CTTGAAACTG AATTACGGGG 2040

AAGCCTATCC GCCTCTTGCA AGCAAAGAAA CCTTCCAAGT AGAGGTTATT GGAGAAGTAG 2100 

AAGCAGTCGA AGTGACGGAA GCCTACTTGT CAGAAATTAT TTCTGCACGA ATCAAGCACA 2160 

TCCTTGAACA AATCAAGCAA GAATTAGATA GAAGGCGTCT ATTGGACCTC CCTGGTGGTA 2220 

TTGTCTTAAT CGGTGGGAAT GCCATTTTAC CAGGTATGGT TGAGCTTGCT CAGGAAGTCT 2280 

TTGGCGTCCG TGTCAAGCTT TATGTTCCAA ATCAAGTTGG TATCCGTAAT CCAGCCTTTG 2340 

CGCATGTGAT TAGTTTATCA GAATTTGCGG GTCAATTAAC AGAAGTTAAT CTTTTGGCTC 2400

AGGGAGCGAT AAAAGGTGAG AATGACTTAA GTCATCAGCC AATTAGTTTT GGTGGGATGC 2460

TGCAAAAAAC AGCTCAGTTT GTACAATCAA CGCCTGTTCA ACCAGCTCCT GCTCCAGAAG 2520 

TAGAGCCGGT GGCGCCTACA GAACCAATGG CGGATTTCCA ACAAGCTTCA CAAAATAAAC 2580 

CGAAATTAGC AGATCGTTTC CGTGGATTGA TCGGAAGCAT GTTTGACGAA TAAAGAGGAA 2640 

AAATAAATTA TGACATTTTC ATTTGATACA GCTGCTGCTC AAGGGGCAGT GATTAAAGTA 2700 

ATTGGTGTCG GTGGAGGTGG TGGCAATGCC ATCAACCGTA TGGTCGACGA AGGTGTTACA 2760 

GGCGTAGAAT TTATCGCAGC AAACACAGAT GTACAAGCAT TGAGTAGTAC AAAAGCTGAG 2820 

ACTGTTATTC AGTTGGGACC TAAATTGACT CGTGGTTTGG GTGCAGGAGG TCAACCTGAG 2880 

GTTGGTCGTA AAGCCGCTGA AGAAAGCGAA GAAACACTGA CGGAAGCTAT TAGTGGTGCC 2940 

GATATGGTCT TCATCACTGC TGGTATGGGA GGAGGCTCTG GAACTGGAGC TGCTCCTGTT 3000 

ATTGCTCGTA TCGCCAAAGA TTTAGGTGCG CTTACAGTTG GTGTTGTAAC ACCTCCCTTT 3060 

GGTTTTGAAG GAAGTAAGCG TGGACAATTT GCTGTAGAAG GAATCAATCA ACTTCGTGAG 3120 

CATGTAGACA CTCTATTGAT TATCTCAAAC AACAATTTGC TTGAAATTGT TGATAAGAAA 3180 

ACACCGCTTT TGGAGGCTCT TAGCGAAGCG GATAACGTTC TTCGTCAAGG TGTTCAAGGG 3240 

ATTACCGATT TGATTACCAA TCCAGGATTG ATTAACCTTG ACTTTGCCGA TGTGAAAACG 3300 

GTAATGGCAA ACAAAGGGAA TGCTCTTATG GGTATTGGTA TCGGTAGTGG AGAAGAACCT 3360

GTGGTAGAAG CGGCACGTAA GGCAATCTAT TCACCACTTC TTGAAACAAC TATTGACGGT 3420 

GCTGAGGATG TTATCGTCAA CGTTACTGGT GGTCTTGACT TAACCTTGAT TGAGGCAGAA 3480 

GAGGCTTCAC AAATTGTGAA CCAGGCAGCA GGTCAAGGAG TGAACATCTG GCTCGGTACT 3540 

TCAATTGATG AAAGTATGCG TGATGAAATT CGTGTAACAG TTGTTGCAAC GGGTGTTCGT 3600 

CAAGACCGCG TAGAAAAGGT TGTGGCTCCA CAAGCTAGAT CTGCTACTAA CTACCGTGAG 3660

ACAGTGAAAC CAGCTCATTC ACATGGCTTT GATCGTCATT TTGATATGGC AGAAACAGTT 3720 

GAATTGCCAA AACAAAATCC ACGTCGTTTG GAACCAACTC AGGCATCTGC TTTTGGTGAT 3780

TGGGATCTTC GCCGTGAATC GATTGTTCGT ACAACAGATT CAGTCGTTTC TCCAGTCGAG 3840

CGCTTTGAAG CCCCAATTTC ACAAGATGAA GATGAATTGG ATACACCTCC ATTTTTCAAA 3900

AATCGTTAAG TAAATGAATG TAAAAGAAAA TACAGAACTT GTTTTTCGAG AAGTTGCAGA 3960

GGCTAGTCTG AGTGCTCATC GAGAGAGTGG TTCGGTCTCT GTCATTGCAG TTACCAAGTA 4020

TCTAGATGTA CCGACAGCGG AAGCCTTGCT TCCGCTAGGT GTCCATCATA TCGGTGAAAA 4080

TCGTGTAGAT AAGTTTCTGG AAAAATATGA AGCTTTAAAA GATCGAGATG TGACTTGGCA 4140

TTTGATTGGT ACCTTGCAAA GACGTAAGGT GAAAGATGTC ATTCAATACG TTGATTATTT 4200

CCATGCATTG GACTCAGTAA AGCTAGCAGG GGAAATTCAA AAAAGAAGTG ACCGAGTCAT 4260

CAAGTGTTTC CTTCAAGTAA ATATTTCTAA AGAAGAAAGC AAACACGCTT TTTCGAGAGA 4320

GGAACTGCTG GAAATCTTGC CAGAGTTAGC CAGACTAGAT AAGATTGAAT ATGTTGGTTT 4380

AATGACGATG GCACCTTTTG AGGCTAGCAG TGAGCAGTTG AAAGAGATTT TCAAGGCGGC 4440

CCAAGATTTA CAAAGAGAAA TTCAAGAGAA ACAAATTCCA AATATGCCTA TGACCGAGTT 4500

AAGTATGGGA ATGAGTCGTG ATTATAAAGA AGCGATTCAA TTCGGTTCCA CTTTTGTTCG 4560

TATAGGTACA TCATTTTTTA AGTAGGAGAG AACCATGTCT TTAAAAGATA GATTCGATAG 4620

ATTTATAGAT TATTTTACGG AGGATGAGGA TTCAAGTCTC CCTTATGAAA AAAGAGATGA 4680

GCCTGTGTTT ACTTCAGTAA ATTCTTCACA GGAACCGGCT CTCCCAATGA ATCAACCTTC 4740

ACAGTCGGCT GGCACAAAAG AGAACAATAT CACCAGACTT CATGCAAGAC AACAGGAATT 4800

GGCAAATCAG AGTCAGCGTG CAACGGATAA GGTCATTATA GATGTTCGTT ATCCTAGAAA 4860

ATATGAGGAT GCAACAGAAA TTGTTGATTT ATTGGCAGGA AACGAAAGTA TCTTGATTGA 4920

TTTTCAGTAT ATGACACACC TCCAGGCTCG TCGTTGTTTG GACTATTTGG ATGGAGCTTG 4980

TCATGTTTTA GCTGGAAATT TGAAAAAGGT AGCTTCTACC ATGTATTTGT TGACACCAGT 5040

GAACGTTATT GTAAATGTTG AAGATATCCG TTTACCAGAT GAAGATCAAC AGGGTGAGTT 5100

CGGTTTTGAT ATGAAGCGAA ATAGAGTACG ATAATGATTT TTTTAATTCG TATGATTTAT 5160

AATGCAGTGG ATATTTACTC CCTGATTTTG GTAGCCTTCG CTGTCATGTC TTGGTTTCCA 5220

GGTGCCTACG AATCCAGTTT AGGTCGTTGG ATTGTAGCGT TGGTGAAACC AGTGCTTGCT 5280

CCCTTGCAAC GCCTGCCTTT ACAGATAGCG GGTCTTGATT TATCTGTTTG GGTTGCGATT 5340

GTTTTGGTTC GATTTTTAGG AGAAAACCTA GTGCGTTTTC TGGCGATGAT AGGATGAATA 5400

AAGGGATTTA TCAGCATTTC TCCATAGAAG ATCGTCCATT TCTTGACAAG GGAATGGAAT 5460

GGATAAAGAA GGTAGAAGAT AGCTATGCTC CTTTTTTAAC TCCTTTTATC AATCCTCATC 5520

AGGAGAAGCT ATTAAAGATT TTGGCCAAAA CCTATGGTCT TGCTTGTAGC AGTAGTGGGG 5580

AATTCGTCTC GAGTGAGTAT GTTCGAGTTT TATTATACCC AGATTATTTC CAACCAGAGT 5640

TTTCAGATTT TGAAATATCT CTCCAGGAAA TTGTGTATTC CAATAAATTT GAACATTTAA 5700 

CGCATGCTAA GATTTTAGGG ACAGTCATCA ATCAATTAGG GATTGAACGG AAACTTTTTG 5760

GAGATATCCT AGTAGATGAA GAACGGGCGC AGATTATGAT TAATCAGCAG TTTCTTCTTC 5820 

TCTTTCAAGA TGGACTAAAG AAAATTGGTC GTATACCTGT TTCGCTGGAG GAACGTCCTT 5880 

TCACCGAGAA AATAGATAAG CTAGAACAGT ATCGAGAACT GGATTTATCT GTGTCTAGTT 5940 

TTCGATTAGA TGTTCTTTTA TCAAATGTTT TGAAACTATC TAGGAATCAA GCAAACCAGT 6000 

TGATTGAAAA GAAACTTGTC CAAGTAAATT ATCATGTGGT AGACAAATCA GATTACACTG 6060 

TTCAAGTTGG AGACTTGATT AGTGTGAGAA AATTTGGTCG CTTGAGATTA CTTCAAGATA 6120 

AGGGACAAAC GAAAAAAGAG AAGAAAAAAA TAACCGTCCA GTTATTATTA AGTAAGTGAG 6180 

GAATAGAATG CCAATTACAT CATTAGAAAT AAAGGACAAG ACTTTTGGAA CTCGATTCAG 6240 

AGGTTTTGAT CCAGAAGAAG TCGATGAATT TTTAGATATT GTGGTTCGTG ATTACGAAGA 6300 

TCTTGTGCGT GCGAATCATG ATAAAAATTT GCGTATTAAG AGTTTAGAAG AGCGTTTGTC 6360 

TTACTTTGAT GAAATAAAAG ATTCATTGAG CCAGTCTGTA TTCATTGCTC AGGATACAGC 6420 

TGAGAGAGTG AAACAGGCGG CGCATGAACG TTCAAACAAT ATCATTCATC AAGCAGAGCA 6480

ACATGCGCAA CGCTTGTTGG AAGAAGCTAA ATATAAGGCA AACGAGATTC TTCGTCAAGC 6540 

AACTGATAAT GCTAAGAAAG TCCCTGTTGA AACAGAAGAA TTGAAGAACA AGAGCCGTGT 6600 

CTTCCACCAA CCTCTCAAAT CTACAATTGA GAGTCAGTTG GCTATTGTTG AATCTTCAGA 6660

TTGGGAAGAT ATTCTCCGTC CAACAGCTAC TTATCTTCAA ACCAGTGATG AAGCCTTTAA 6720 

AGAAGTGGTT AGCGAAGTAC TTGGAGAACC GATTCCAGCT CCAATTGAAG AAGAACCAAT 6780 

TGATATGACA CGTCAGTTCT CTCAAGCAGA AATGGCAGAA TTACAAGCTC CTATTGAGGT 6840 

AGCCGATAAA GAATTGTCTG AATTTGAAGC TCAGATTAAA CAGGAAGTGG AAGCTCCAAC 6900 

TCCTGTAGTG AGTCCTCAAG TTGAAGAAGA GCCTCTGCTC ATCCAGTTGG CCCAATGTAT 6960 

GAAGAACCAG AAGTAGCTCC AATGCATCCG ATAGGTCCAA CACCAGCTAC AGAAACTGTT 7020 

GATTCAATAC CGGGATTTGA AGCACCGCAA GAATCTGTTA CAATTTTATA AGAAATATTC 7080 

TGAGAACAAT ATCTTATCCT TATATTTCCA GCGAGCAGGA GATGGTGTGA GTCCTGTAAT 7140 

CCCTATTGAT AAGATTATCC TCTCAAAAAC TCAAGTCTGA AGCTAGTAAG ATTTGACGTT 7200 

TCCCACGTTA CGGGATAAGA GGGAGAAAGA CTAAATCTTT TTCCGAATAA AGGTGGTACC 7260 

ACGATTTTCG TCCTTTTTGG AAGTCGTGGT TTTTAATTTG TTATTATTTA TAAAGGAGAT 7320 
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ACCATGAAAC 


TCAAAGACAC 


CCTTAATCTT 


GGGAAAACTG 


AATTCCCAAT 


GCGTGCAGGC 


7380 


CTTCCTACCA 


AAGAGCCAGT 


TTGGCAAAAG 


GAATGGGAAG 


ATGCAAAACT 


TTATCAACGT 


7440 


CCTCAAGAAT 


TGAACCAAGG 


AAAACCTCAT 


TTCACCTTGC 


ATGATGGCCC 


TCCATACGCT 


7500 


AACGGAAATA 


TCCACGTTGG 


ACATGCTATG 


AACAAGATTT 


CAAAAGATAT 


CATTGTTCGT 


7560 


TCTAAGTCTA 


TGTCAGGATT 


TTACGCACCA 


TTTATTCCTG 


GTTGGGATAC 


TCATGGTCTG 


7620 


CCAATCGAGC 


AAGTCTTGTC 


AAAACAAGGT 


GTCAAACGTA 


AAGAAATGGA 


CTTGGTTGAG 


7660 


TACTTGAAAC 


TTTGCCGTGA 


GTACGCTCTT 


TCTCAAGTAG 


ATAAACAACG 


TGAAGATTTT 


7740 


AAACGTTTGG 


GTGTTTCTGG 


TGACTGGGAA 


AATCCATATG 


TGACCTTGAC 


TCCTGACTAT 


7800 


GAAGCAGCTC 


AAATTCGTGT 


ATTTCGTGAG 


ATCGCTAATA 


AGGGTTATAT 


CTACCGTGGT 


7860 


GCTAAGCCAG 


TTTACTGGTC 


ATGGTCATCT 


GAGTCAGCAC 


TTGCTGAAGC 


AGAGATTGAA 


7920 


TACCATGACT 


TGGTTTCAAC 


TTCCCTTTAC 


TATGCCAACA 


AGGTAAAAGA 


TGGCAAAGGA 


79B0 


GTTCTAGATA 


CAGATACTTA 


TATCGTTGTC 


TGGACAACGA 


CTCCATTTAC 


CATCACAGCT 


8040 


TCTCGTGGTT 


TGACGGTTGG 


TGCAGATATT 


GATTACGTTT 


TGGTTCAACC 


TGCTGGTGAA 


8100 


GCTCGTAAGT 


TTGTCGTTGC 


TCCTGAATTA 


TTGACTAGCT 


TGTCTGAGAA 


ATTTGGCTGG 


8160 


GCTGATGTTC 


AAGTTTTGGA 


AACTTACCGT 


GGCCAAGAAC 


TCAACCACAT 


CGTAACAGAA 


8220 


CACCCATGGG 


ATACAGCTGT 


AGAAGAGTTG 


GTAATTCTTG 


GTGACCACGT 


TACGACTGAC 


8280 


TCTGGTACAG 


GTATTGTCCA 


TACAGCCCCT 


GGTTTTGGTG 


AGGACGATTA 


CAATGTTGGT 


8340 


ATTGCTAATA 


ATCTTGAAGT 


CGCACTGACT 


GTTGATGAAC 


GTGGTATCAT 


GATGAAGAAT 


8400 


GCTGGTCCTG 


AATTTGAAGG 


TCAATTCTAT 


GAAAAGGTAG 


TTCCAACTGT 


TATTGAAAAA 


8460 


CTTCGTAACC 


L * ■* *v * * *J *— 


CCAAGAAGAA 


ATCTCTCACT 


CATATCCATT 


TGACTGGCGT 


8520 


ACTAAGAAAC 


CAATCATCTG 


GCGTGCAGTT 


CCACAATGGT 


TTGCCTCAGT 


TTCTAAATTC 


8580 


CGTCAAGAAA 


TCTTGGACGA 


AATTGAAAAA 


GTGAAATTCC 


ACTCAGAATG 


GGGTAAAGTC 


8640 


CGTCTTTACA 


ATATGATCCG 


TGACCGTGGT 


GACTGGCTTA 


TCTCTCGTCA 


ACGTGCTTGG 


8700 


GGTGTTCCAC 


TTCCTATCTT 


CTACGCTGAA 


GATGGTACAG 


CTATCATGGT 


AGCTGAAACT 


8760 


ATTGAACACG 


TAGCTCAACT 


TTTTGAAGAA 


TATGGTTCAA 


GCATTTGGTG 


GGAACGTGAT 


8820 


GCCAAAGACC 


TCTTGCCAGA 


AGGATTTACT 


CATCCAGGTT 


CACCAAACGG 


CGAGTTCAAA 


8880 


AAAGAAACTG 


ATATCATGGA 


CGTTTGGTTT 


GACTCAGGTT 


CATCATGGAA 


TGGAGTGGTG 


8940 


GTAAACCGTC 


CTGAATTGAC 


TTACCCAGCC 


GACCTTTACC 


TAGAAGGTTC 


TGACCAATAC 


9000 


CGTGGTTGGT 


TTAACTCATC 


ACTTATCACA 


TCTGTTGCCA 


ACCATGGCGT 


AGCACCTTAC 


9060 
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AAACAAATCT TGTCACAAGG TTTTGCCCTT GATGGTAAAG GTGAGAAGAT GTCTAAATCT 9120 

CTTGGAAATA CTATTGCTCC AAGCGATGTT GAAAAACAAT TCGGTGCTGA AATCTTGCGT 9180 

CTCTGGGTAA CAAGTGTTGA CTCAAGCAAT GACGTGCGTA TCTCTATGGA TATCTTGAGC 9240 

CAAGTTTCTG AAACTTACCG TAAGATTCGT AACACTCTTC GTTTCTTGAT TGCCAATACA 9300

TCTGACTTTA ACCCAGCTCA AGATACAGTC GCTTACGATG AGCTTCGTTC AGTTGATAAG 9360

TACATGACGA TTCGCTTTAA CCAGCTTGTC AAGACCATTC GTGATGCCTA TGCAGACTTT 9420

GAATTCTTGA CGATCTACAA GGCCTTGGTG AACTTTATCA ACGTTGACTT GTCAGCCTTC 9480 

TACCTTGATT TTGCCAAAGA TGTTGTTTAC ATTGAAGGTG CCAAATCACT GGAACGCCGT 9540 

CAAATGCAGA CTGTCTTCTA TGACATTCTT GTCAAAATCA CCAAACTCTT GACACCAATC 9600 

CTTCCTCACA CTGCGGAAGA AATCTGGTCA TATCTTGAGT TTGAAACAGA AGACTTCGTC 9660 

CAATTGTCAG AATTACCAGA AGTTCAAACT TTTGCTAACC AAGAAGAAAT CTTGGATACA 9720 

TGGGCAGCCT TCATGGACTT TCGTGGACAA GCACAAAAAG CCTTGGAAGA AGCTCGTAAT 9780 

GCAAAAGTTA TCGGTAAATC ACTTGAAGCA CACTTGACAG TTTATCCAAA TGAAGTTGTG 9840

AAAACTCTAC TCGAAGCAGT AAACAGCAAT GTAGCACAAC TTTTCATCGT GTCTGAGTTG 9900 

ACCATCGCAG AAGGACCAGC TCCGGAAGCT GCCCTTAGCT TCGAAGATGT AGCCTTCACA 9960 

GTTGAACGTG CTACTGGTGA AGTATGTGAC CGTTGCCGTC GTATCGACCC AACAACAGCA 10020 

GAACGCAGCT ACCAGGCAGT TATCTGTGAC CACTGTGCAA GCATCGTAGA AGAAAACTTT 10080 

GCGGAAGCAG TCGCAGAAGG ATTTGAAGAG AAATAAGATT GAAAAGTCTA GGCAAAATTC 10140 

AATTTGAGAA GAAAAGACAA CTAATTTTAT AGTCTATTAA ACGCATTGTA TCACGTTTTT 10200 

GAATACCTGA TATGATGCGT TTTTTATTTA TTTTAAAAAT TTGCGAGGTA TGACTTTTTA 10260

TACTCAACAA GAATCAAAGA GAAACTTAGC AAGCTAACAG TAGTAAGATA AAATAGGAAT 10320 

TTGATATTAG GGATAAGATT GGTAAATAGT GTAATATTTT TACAACAATA AATTTATATA 10380 

GTTATTTCTG GTTTCTGAAA AGTATTATAT TTTATTTCAT ATTATACAAA TTTTTATTTT 10440 

ATAATATCAG AACATACTTT TTTTAAAAGC AAATATGATA CAATTTTATT TGAAAAAAAT 10500 

AAAAAAGGAG ATTTTATTAT AAAATTAAAA AGACTTGCTT TAATTAGTGG TATCGTCGGT 10560 

CTTGTGGGAG GAATTTTACT TCTTATTGGT CCTTTTGTCT TGTTGGGAAT AGCGGTAAAC 10620 

ACAGCTGCTA CAACTCTTAA TGGAGGAGCT ACTGCAGGGG CTTTTTCAGG TGTAGCCTTA 10680 

CTCTTGAATG CCTTGAAGAT TGCAAATCTT GTTCTTGGTA TCATTGCTAT TGTTTACTAT 10740 

AAAGGAGATA AGCGTGTAGG TGCAGCTCCG TCTGTACTAA TGATTGTTTC TGGTGGAGTT 10800 

AGTCTCATTC TATTCCGTTC TTAGGATGGG TTGGGGGGAT TTTTGCTATT ATCGGAGGAT 10860

CTCTATTCCT TTCAACATTG AAGAAATTCA AATCAGAAGA ATAAAAGGTA TTTTAGCATG 10920

AAAAGAACAA AAAAGTTTAT CGGTATAGCA GTAGCTCTAT TATCTCTTTC TCTTCTAGTT 10980 

GCATGTGGAA CATAAAGTTC AAAGAATACT TCAACAAGTA ATGATGAGAA GACAGTAGCA 11040 

ACATCCAATA GTTCAAAAGA AACAATCACT TTCGATACAC CGGTTGTAAC AGACGATGCG 11100

ATTGAATCAA TACGCACTTA TGCAGATTAT ATAGATCTTT ATAAAAATAT TTTTGATGAT 11160 

TATTTTACTA AAGCTGAGGA AGGTTTCAAA GGCATAGCTA TGGAAAATAA TGACTCGTTT 11220 

ACTAAACTAA AAGAGTCAAC TCAAAAATTA TTCGATGCGC AGAAAAAAAG CTTAAATAAT 11280 

GAAGATAGAA TAGAAACAAC CAAAAACAAT GTGATTGCCA AACATTGTCA AACAGTCCTT 11340 

TCCTTTTTGG TTTTGACTAG CTTTTTTGTG AAAAATTGTG TAAAATAGAA TAGATAAACG 11400 

AGGGGAAACC TCGGAAAATT TAAAGGAGAA TCCATCTAAT GGTAAAATTG GTTTTTGCTC 11460 

GCCACGGTGA GTCTGAATGG AACAAAGCTA ACCTTTTCAC TGGTTGGGCT GATGTTGATT 11520 

TGTCTGAAAA AGGTACACAA CAAGCGATTG ACGCTGGTAA ATTGATCAAA CAAGCTGGTA 11580

TCGAATTTGA CCAAGCTTAC ACTTCAGTAT TGAAACGTGC TATCAAAACA ACTAACTTGG 11640 

CTCTTGAAGC TTCTGACCAA TTGTGGGTTC CAGTTGAAAA ATCATGGCGC TTGAACGAAC 11700 

GTCACTACGG TGGTTTGACT GGTAAAAACA AAGCTGAAGC TGCTGAACAA TTTGGTGATG 11760 

AGCAAGTTCA CATCTGGCGT CGTTCATACG ATGTATTGCC TCCAAACATG GACCGTGATG 11820 

ATGAGCACTC AGCTCACACA GACCGTCGTT ACGCTTCACT TGACGACTCA GTTATCCCAG 11880 

ATGCTGAAAA CTTGAAAGTG ACTTTGGAAC GTGCTCTTCC ATTCTGGGAA CATAAAATCG 11940 

CTCCAGCTCT TAAAGATGGT AAAAACGTAT TCGTAGGAGC TCACGGTAAC TCAATCCGTG 12000 

CCCTTGTAAA ACACATCAAA GGTTTGTCAG ATGACGAGAT CATGGACCTG GAAATCCCTA 12060

ACTTCCCACC ATTGGTATTC GAATTCGACG AAAAATTGAA CGTCGTTTCT GAATACTACC 12120 

TTGGAAAATA AAAAATTGTA AGTCTAGAAT TGATTTCTAG GCTTTTTATG TTAGTATGGA 12180 

AGTATGATAA GGAATAAAAA ACAAGATTAT GTACTGGCCT ACAAGCAACC AGCTTCAACC 12240 

ACTTACATGG GTTGCGAAGA AGAAGCTTTA CCGATAGGCA ATGGTTCTTT AGGAGCAAAA 12300 

GTATTTGGCC TTATAGGGGC TGAACGGATT CAATTTAATG AAAAAAGTCT CTGCTCTGGA 12360 

GGTCCACTTC CTGATAGTTC AGATTATCAG GGTGGAAATC TTCAGGATCA GTATGTTTTT 12420 

TTAGCTGAGA TTCGGCAGGC TTTGGAGAAG AGAGATTACA ATCTGGCTAA GGAACTGGCT 12480 

GAGCAGCACC TAATTGGGCC AAAAACGAGT CAATATGGGA CCTATCTGTC TTTTGGGGAT 12540 

ATTCACATTG AGTTCAGCCA GCAAGGTACG ACTTTGTCTC AGGTGACGGA CTATCAGAGA 12600

CAGCTGAATA TTAGTAAGGC ACTTGCGACG ACTTCTTATG TCTATAAGGG AACGCGATTT 12660

GAACGTAAAG CTTTTGCGAG TTTTCCAGAT GATCTCTTGG TTCAATGTTT TACTAAGGAA 12720 

GGGTTGGAAA CTCTAGATTT TACTATAGAA CTATCCTTGA CCTGTGATTT GGCTTCTGAT 12780 

GGAAAGTATG AGCAGGAAAA ATCTGATTAC AAGGAGTGTA AGTTGGATAT TACTGATTCT 12840 

CATATCTTGA TGAAGGGAAG AGTTAAGGAT AATGATCTGC GGTTTGCTAG TTATCTAGCT 12900 

TGGGAAACCG ATGGAGATAT TAGAGTTTGG TCAGATAGGG TTCAGATATC AGGAGCCAGT 12960 

TATGCCAATC TCTTCTTGGC CGCTAAGACG CATTTTGCCC AAAATCCTGC TAGCAATTAT 13020 

CGCAAGAAAC TAGATTTAGA GCAACAGGTG ATAGACTTGG TGGACACAGC TAAAGAAAAG 13080 

GGCTATACCC AATTGAAATC AAGGCATATC GAGGACTACC AAGCCTTATT CCAGCGTGTT 13140 

CAATTGGATT TGGAAGCTGA TGTTGACGCA TCCACTACAG ATGATTTGTT AAAAAATTAT 13200 

AAGCCACAAG AAGGGCAGGC TTTGGAGGAG CTGTTCTTCC AGTATGGACG GTATTTATTG 13260 

ATTAGTTCGT CCAGAGACTG CCCAGATGCT CTACCAGCTA ACCTACAGGG AGTCTGGAAT 13320 

GCGGTCGACA ATCCTCCTTG GAATTCGGAC TATCACTTAA ATGTCAATCT GCAGCTGAAT 13380 

TATTGGCCAG CCTATGTTAC CAATCTCCTA GAGACGGTCT TTCCAGTCAT CAACTATGTA 13440 

GATGATTTGC GTCTCTATGG TCGTCTAGCG GCTGTAAAGT ATGCAGGAAT CGTCTCTCAG 13500 

AAAGGTGAGG AGAATGGTTG GTTGGTTCAT ACTCAAGCGA CTCCCTTTGG TTGGACGGCA 13560 

CCTGCTTCGG ATTACTATTG GGGTTGGTCA CCAGCTGCCA ATGCGTGGAT GATGCAAACC 13620 

GTTTATGAAG CCTATTTATT TTATAGGGAC CAAGACTATC TCAGGGAGAA AATTTATCCC 13680

ATGTTGAGGG AAACGGTTCC TTTTTGGAAT GCCTTTTTAC ATAAGGATCA GCAGCCGCAG 13740 

CGTTGGGTGT CTTCTCCGTC TTATTCCCCA GAACATGGGC CGATTTCGAT TGGCAATACC 13800 

TATGACCAAT CTCTGATTTG GCAGTTATTT CATGATTTTA TTCAGGCTCC TCAGCAATTC 13860 

GGACTGGATG AGGACTTGTT GACTGAGGTT AAGGAGAAGT CTGATTTACT AAATCCTTTG 13920 

CAAATCACTC AATCTGGTCG AATCAGGGAG TGGTATGAGG AGGAAGAGCA GTATTTTCAA 13980 

AATGAGAAAG TGGAGGCCCA GCATCGGCAC GCTTCCCATC TAGTGGGACT CTATCCTGGC 14040 

AATCTCTTTA GCTACAAGGG ACAAGAGTAT ATTGAAGCGG CGCGTGCTAG CCTCAATGAT 14100 

CGTGGAGATG GCGGCACAGG CTGGTCCAAG GCTAATAAGA TCAATCTCTG GGCGCGTTTG 14160 

GGAGATGGCA ATCGAGCCCA TAAATTATTG GCAGAGCAGT TAAAGACATC CACCTTGCAA 14220 

AATCTTTGGT GTAGCCATCC TCCTTTTCAG ATAGATGGTA ATTTTGGTGC TACTAGTGGC 14280

ATGGCAGAAA TGTTACTCCA GTCTCATGCA GCTTATCTGG TACCTCTAGC TGCCCTACCT 14340 

GATGCTTGGT CAACAGGTTC TCTTTCAGGC TTAATGGCAC GTGGACATTT TGAAGTGAGC 14400

ATGAGCTGCG AAGATAAAAA ACTCTTACAG TTGACCATTT TATCAAGGAG TGGAGGAGAT 14460

TTGCGAGTTT CTTATCCAGA TATTGAGAAG AGTGTGATTA AAATGAATCA AGAAAAAATA 14520

AAAGCGAAAT GCATGGGGAA AGATTGTATT TCGGTGGCAA CAGCAGAAGG TGATCTTGTT 14580 

CAATTTTATT TTTAAGAAGA TGTTATAAGG CAGTAATTTG AAACTGCCTT TTAATAAGGA 14640 

TTTAAGAATA TAAGCAGTTT TCAACTAGTT GAAAAAACGT TATAATGATA ATAGGAAGTA 14700 

ATACTCAATG AAAATCAAAG AGCACAAACT AGGAAGCTAG CCGCAGGTTG CTCAAAACAG 14760 

TGTTTTGAGG TTGCAGATGG AAGCTGACGT GGTTTGAAGA GAGATTTTCG AGGAGTATAA 14820

TTTGTTTGAT AGAGGGTGGG TCTGATGGCT TATATTGAGA TGAAACACTG TTACAAGCGT 14880 

TATCAGGTTG GGGACACGGA GATTGTGGCC AATTGTGATG TGAATTTTGA GATTGAAAAG 14940 

GGGGAGCTGG TTATTATCCT TGGTGCTTCA GGTGCAGGCA AGTCAACAGT TCTTAACCTT 15000 

CTTGGGGGAA TGGATACCAA TGATGAAGGG GAAATCTGGA TTGATGGTGT TAATATTGCG 15060 

GATTATAGTT CCCACCAGCG CACCAATTAC CGTAGAAATG ATGTGGGGTT TGTTTTTCAG 15120 

TTTTATAATC TAGTTTCTAA TCTGACAGCT AAGGAAAATG TGGAACTGCC TTCTGAAATT 15180 

GTGACAGATG CCTTGAATCC TGATCAGGCC TTGACAGATG TAGGTCTGGC TCATCGTCTC 15240 

AATAACTTTC CAGCCCAGCT TTCTGGAGGG GAGCAACAGC GAGTCTCCAT TGCACGCGCG 15300 

GTAGCCAAAA ATCCTAAAAT TCTCCTTTGT GATGAACCGA CTGGAGCCTT GGATTATCAG 15360 

ACGGGCAAGC AGGTTTTGAA AATTCTCCAA GACATGTCTC GTCAAAAGGG AGCGACGGTC 15420 

ATCATCGTGA CTCATAATGG AGCTTTGGCG CCCATTGCTG ATCGCGTGAT TCAAATGCAC 15480 

GATGCCAGTG TCAAGGATGT GGTGCTCAAC CAGCATCCTC AGGATATTGA CAGTTTGGAC 15540 

TACTAGCATG ATCAAGCGAA AAACTTATTG GAAGGACTTA GTTCAGTCCT TCACAGGCTC 15600 

CAAGGGGCGT TTTTTATCCA TCTTGATCCT GATGATGTTG GGATCTCTAG CCTTAGTAGG 15660 

CCTCAAAGTA ACCAGTCCCA ACATGGAGGC GACAGCTAAT GCTTATTTAA CAACTGCTCA 15720 

AACCTTGGAT TTGGCAGTCA TGTCTAACTA TGGCTTGGAT CAAGCAGACC AAGAAGAACT 15780

AAAACAGACG GAGGGCGCAG AGGTCGAGTT TGGCTATTTG ACAGATGTGA CTATGGATAA 15840 

TGGGCAGGAT GCCATTCGGC TGTACTCCAA ACCAGAGCGA ATTTCAACCT TTCAGCTAAG 15900 

AAAGGGACGA CTTCCTCAGT CAGACAAGGA AATCGCTTTG GCCACTCATT TGCAAGGCCA 15960 

ATACAGCGTG GGACAGGAGA TTAGTTTTAA AGAAAAAGAA GAGGGTCATT CCTCTTTAAA 16020 

AGACCATACT TATACCATTA CTGGTTTTGT GGATTCGGCT GAAATCCTCT CCCAGCGAGA 16080 

TATGGCCTAC GCAGGAAGTG GAAGTGGGAC TCTGACAGCC TATCGGGTGA TTTTACCTAG 16140

TCAATTTGAT CAGAAAGTCT ACAATATAGC TCGTTTGAAA TATCAAGATT TAGCGGGTTT 16200

AAATGCCTTT TCATCAGCTT ATGAAGAAAA ATCCAAGCAA CATCAAGAAG AGCTTGAACA 16260

AATTTTATCA GATAATGGCA AGGTACGTCT GCAACTTTTG AAAAAAGAAG GACAAGAGTC 16320

TCTAGACAAG GGGCAAGAGA CCCTTGACAA GGCTCAGACT AATTTGCAGG AAGGCAAGCG 16380

TCGTTTAGCA GCTGCTCAAG CTCGTATACA GGCTCAAGAA AGTCAACTAG CCTTGTTTCC 16440

TCAAGTTCAG AGAGAGCAGG CTAGTGCTCA ACTTACCCAA GCCAAGCAGG AATTGGGCAA 16500

GGAAGAGGAC AAACTAAAGC AAGCTGAACA AAATCTAGCC CAAGAAAAGG AAAAATTAGA 16560

AAAACATCAG CAAGTCTTGG ATGATTTGGC GGAGCCAAGG TATCAGGTTT ATAATCGTCA 16620

GACCATGCCA GGTGGTCAGG GCTATCTTAT GTATAGCAAT GCTTCATCCA GTATTCGAGC 16680

AGTGGGCAAT ATCTTTCCTG TGGTACTTTA TGCCGTAGCA GCCATGGTGA CCTTTACGAC 16740

CATGACTCGC TTTGTAGACG AAGAGCGAAC TCATGCAGGG ATTTTTAAGG CCTTGGGTTA 16800

TCGTAGTAAG GATATTATCG CCAAGTTTCT CCTTTATGGA CTAGTAGCTG GGACTGTCGG 16860

AACGGCTCTA GGTAGTATAC TTGGTCATTA TTTGCTAGCC AGTGTAATTT CAAGTGTCAT 16920

TACAAAAGGC ATGGTGGTGG GAGAAACTCA GATTCAGTTC TATTGGACCT ATAGCTTACT 16980

AGCTTTTGTC TTGAGCTTGT TGGCGAGTGT GTTACCAGCC TATCTGGTGG CTTGGAGGGA 17040

ACTTCATGAC GAAGCAGCCC AGCTTCTACT TCCTAAACCT CCTGTCAAAG GAGCTAAAAT 17100

CTTATTGGAG CGTATCGGTT TTATCTGGCG TCGTCTCAGT TTTACTCATA AGGTAACAGC 17160

CCGCAACATC TTTCGTTATA AGCAGAGAAT GTTGATGACA ATCTTTGGTG TGGCAGGTTC 17220

TGTAGCTCTG CTCTTTGCAG GTTTGGGAAT CCAATCTTCT CTAGCAGGAG TTCCGTCTAA 17280

ACAGTTTCAA CAAATCCAAC AGTATCAGAT GCTTGTCTCT GAAAATCCTA GTGCGACCAA 17340

TCAGGACAAG GTAGAGCTAG CAGAAGTGTT GAAAGGGCAG GAGATACTAG CCTACCAGAA 17400

AATCTATTCT AAAGCGCTAT ACAAGGATTT CAAAGGCAAA GCTGGTCTTC AAAACATTAC 17460

TCTTATGATG ATAGAGAAGG [illegible] [illegible] [illegible] ATCATCAGCA 17520

GGAGCTGACA TTAAAACATG GCATCGTTAT TACAGCTAAA CTCGCCCAGC TGGCAGGTGT 17580

CAAGGTTGGG CAGACTTTAG AAATTGAAGG TAAGGAACTA AAGGTCGTTG CTATTACTGA 17640

GAACTACGTT GGTCACTTTA TTTATATGAG TCAGGCTAGC TATGAGCAAC TTTACGGACA 17700

GCTACCCCAA GCCAACACTT ATCTGGTCTC ATTAAGGGAT ACCAGTGCAA CTAGTATCGA 17760

AAGTCAGGCG GGCTTGCTTA TGAATCAATC TGCGGTGTCC AGCGTTGTCC AAAATGCTTC 17820

AGCCATTCGA CTCTTCGACT CTATCGCTAG CTCACTCAAT CAGACCATGA CCATCTTGGT 17880

CATCGTATCG GTTCTATTAG CTATTGTCAT CCTTTACAAT CTGACCAATA TCAACGTAGC 17940

TGAGAGAATC CGTGAACTCT CCACTATCAA GGTTCTTGGT TTTCATAATA ATGAAGTCAC 18000

CCTCTACATT TACCGTGAGA CGATTGTGCT GTCCCTTGTG GGAATCGTAC TTGGTCTGAT 18060

AGCTGGTTTC TATTTACACC AATTTTTGAT TCAAATGATT TCGCCTGCGA CTATTCTCTT 18120

TTATCCGCAG GTAGGCTGGG AAGTCTATGT AATCCCAGTG GCAGCAGTAA GCATCATTTT 18180

GACCTTGCTT GGTTTCTTCG TCAATTATTA TCTGAGAAAG GTTGATATGT TAGAAGCCCT 18240

GAAATCTGTA GAGTAAGGTA GTTATTTTTA GCTGATTGAA CTTCTATTTA CTAATATTCA 18300

AAAATCCTCC GTTTCAAAGA GCAGGGAACT CTTTGTGACA GAGGATTTTT TCTATAGGGC 18360

TTTAGCAGCT GCAATTGCGG CTTCGAAGTT TGGCTCAGAA TTGATATTAT CCACGTATTC 18420

AACGTAGCGA ATCGTATTGT CAGTATCGAG GACAAAGACT GCGCGTGCTA ATAGGTGCCA 18480

TTCGTTGATC AAGAGGGCAT AATCGCGCCC GAAAGAATGG TCAAAGTAGT CTGAAAGCAT 18540

AATGGCATTG TCAAGGCCTT CAGCACCGCA CCAACGTTTT TGAGCAAAAG GTAGGTCCAT 18600

TGAAACAGTC AATACGACCG TGTTGTCCAG TCCAGCCAAT TCTTCATTAA AACGACGTGT 18660

TTGAGTTGAG CAGATGCCTG TATCGATAGA AGGAACGACA CTCAAGACTT TTTTCTTGCC 18720

ATCAAAATCA GCCAGAGATT TTTTAGAAAG ATCTGTTGTA GTAAGAGAAA AATCAAGCGC 18780

CTTGTCGCCG ACTTGTAGTT GTTTACCTGT AAAGCTCACA GGATTTCCGA GAAAAGTTAC 18840

CATAGGATAC TCCAATCTTT TTTCTTCCAT TTTAGCTGAA ACACTCGGAA TTTTCCAATG 18900

ATTTGACCGG AAATATGGGC ATAGAAAAAA CGCCAGCTCA TGTGAGAATG ACGTTTTTCA 18960

TAGGTTTATT TTGCCAATCC TTCAGCAATC TTGTCAAGGT TGTATTTCAT CATGCTGTAG 19020

TAGCTGTCGC CTTCTTTACC TTGTTCTGCG ATAGAGTCAG TAAAGATTTG AGCGTAGATT 19080

GGGATGTTTG TGTCTTGAGA AACAGTTTTC ATTGGACGGT CATCCACACT TGATTCTACA 19140

AAGAGTGATG GAACTTTTGT TTGGCGAAGT TTTTCAACCA AGGTCTTGAT TTGTTCAGGA 19200

GTTCCTTCTT CTTCAGTATT GATTTCCCAG ATGTAAGCAC TTGGGACACC ATAGGCTTTA 19260

GAGAAGTATT TGAATGCTCC TTCGCTGGTT ACAATGACTT TCTTTTCAGC AGCGATCTTA 19320

TTAAATTTAT CCTTACTTTC TTTATCAAGT TTGTCTAACT TATCAGTATA TTCTTTGAGA 19380

TTTTTTTCAT AGAATTCTTT ATTGTTAGGG TCTTTGGCGC TCAATTGTTT GGCGATATTT 19440

TTAGCAAAAA TAATACCGTT TTCAAGGTTA AGCCAAGCGT GTGGGTCTTC TTTTCCTTTT 19500

TCATTTTGAC CTTCAAGGTA GATAACATCA ACGCCGTCGC TGACTGCGAA GTAGTCTTTG 19560

TTTTCAGTTT TCTTGGCATT TTCTACCAAT TTTGTAAACC AAGCATTGCC ACCTGTTTCA 19620

AGGTTGATAC CGTTATAGAA AATCAAATTA GCCTCAGAAG TTTTCTTAAC GTCTTCAGGA 19680

AGTGGTTCGT ATTCCTGTGG GTCTTGCCCA ATCGGAACGA TACTATGAAC GTCAATTTTG 19740

TCACCAGCAA TATTTTTAGT AATATCAGCG ATGATTGAGT TTGTAGCAAC AACTTTTAGT 19800 

TTTTGACCAG AAGTTGTATC TTTTTTTCCG CTAGCACATG CTACAAGAAT GATTGCAGAA 19860 

AGAAAGAGAA CGAGTAATGT ACCTAATTTT TTCATTAGAT CCTCCAATTT ATTAGGGCTT 19920 

TGCCCCTTAT TTTAACAAAT GTTTATTTTT CAGTTTCAAA TATCGTTGTT TGGGAGCGAT 19980 

AAAGAAGCTA ATGAGAAAGA AACTAGCAGC TGTAAGCACG ATACTAGAAC CTGCCGCAAC 20040 

ATTAAAACTA TAGCCAATAA AGAGTCCCAA AACTGAAGCA GTAGCTCCGA AGGTTGAGGA 20100 

AAGGAAAATC ATACTTTTCA GACTATTAGC ATACAGATAA GCAGTTGCAG CTGGGGTAAT 20160 

CAGCATGGCT ACAATCAGGA TAGTTCCGAC ACTTTGCATG GCTGTCACAG ACACGAGAGT 20220 

CAGGAGTACC ATGAGAAGGT AGTGATAGAA ATTGACAGGC ATTCCCATGG CTTTAGCCAA 20280 

GAGTTCATCA AAGGAAGTTA TCAAGAGTTG CTTGAAGAAA ATCCAGATTA ACAAGAGGAT 20340 

AGCTGCCCCC ACACCCATAG TAATAAACAT ATCCGTATCT TGGACGGCCA GGATATTACC 20400 

AAAAAGGATA TGGAAAAGGT CAGTTGAACT TTTAGCGACA CCAATCAAGA TGATACCGAG 20460

GGCTAAGAAA GAAGAAAAGG TAATGCCGAT GGCGGTATCG CTTTTGATAA TCGAGTTTCC 20520 

TTTGATGTAG GTAATGATGA TGGCAGCTAG CAATCCAAAG ACAATGGCTC CGATAAAGAA 20580 

GTCAAGGCCC AAGATGAAGG ATAGGGCTAC ACCTGGTAAG ACAGCATGTG AAATGGCATC 20640 

TCCCATGAGT GACATCCCGC GTAGAATAAT GAAACATCCC ACAGCTCCAG CTACAATCCC 20700 

GACGACAATA GCTGTTATCA AGGCATTTTG TAGGAAATGG AATTTTTGCA ATCCATCGAT 20760 

AAATTCTGCA ATCATAGGTC ACCTCCATTG AAAAAGAGTT GATTACCGTA AGCTTCTTTT 20820 

AGATTGGTTT CGGTAAAAGT TTCTTTTGTT GGACCAAAGC CAATCACTTC TCGATTGACA 20880 

AGTAAGACTT GATCGAAGTA GTGGGGAATC TTGCTGAGGT CGTGGTGAAC GATGAGAACC 20940 

GTCTTCCCAG CTTTTTTCAA ATCTCTCAGC GTATTCATGA TGATTTCCTC ACTGACAGAG 21000 

TCAATCCCAG CAAAGGGTTC ATCCAAGAGG ATATAGTCGG CTTCCTGCAC CAAACATCTG 21060

GCAATCAAGA CCCGCTGGAA TTGACCTCCA GACAGTTGAC TAATTTGACG TTCAGCGTAG 21120 

TCAGCTAGGC CGACGATTTC AAGGGCCTCT TGCACTTTCT TCCAATGTTT AGCCTTTAAA 21180 

CTTCGAAAGA GAGGAATAGA GGGAAATACT CCTAACGAGA CGCATTCCTT GACCTTGATG 21240 

GGAAAGTTGT AGTCGATATT GATTTTTTGT TCGACATAGG CAATTCGGTG TAAGGATTTT 21300 

TTAACTTCCT TGTCATCGAG AAATGCCTGA CCTTGATGTG GGATAATTCC CAACATACCT 21360 

TTTAATAGTG TTGATTTCCC AGCGCCGTTT GGACCAATGA TGCCGGTAAT TGTTGGTCCA 21420 

TGGAGCACTA GTGAAATATC CTTAAGTGCC AACGTTTCTT TGTAGGAGAC ACTGAGGTTT 21480 
TCGATACGTA TCATAAACTT GTATTCCTCC TGTCTCTTAA TATACATTAA AAAAAAAATT 21540 

AAGTCAAGTT AATTTTTGAA AAAATTAAAA TAATAACTCA AAAATAGATT CTAAAGATAA 21600 

CTTTCAGGAT AAATTTCTAA ATTATAAAAC GCATAGTATC AAGTGTAAAA AACTTGGAAT 21660 

TATGCGTTTT ATCATGGAAA GATTTTTTAT AATAGCTAAA AAATAA 21706 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6171 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GATCCCCAGG AAAAACCGAG GTTTTCCCAA TCAATCGTTA CTGTCATATT CCACTCCTTA 60 

TTCTAAAAAC CTATTTCTTA TATTCTACAC TATTTTTCTA AAATAGCAAG TATATTTTGT 120 

AATTTTCAGA AAATTTCTCC AATAAAAACC AACTCTTAGA ACTGATTCTT CATTTCACTT 180 

ATTTATCTTC AGTAACTACT TCCTGAAGAT AAGCGTCAAA AACTTCTTCA TCTGAAATCG 240 

TGTCAGAAAT GAAGCTTCCA TTGCTAGTGC GTTCTGACAA GTTCAAGTCT TGCAATCGGC 300 

TTTCATAGAT TGTTCCTTTA TTGGATTGGA CAAGCAGAGT TTCGTCGTTC ACATCCACTT 360 

CCGTACTGAA GAAATCGCCA ACAAATCCTT GCTCTGCAAC TGCTCCTGCC AAGAAGACAC 420 

GATGCGGTTT GTTTTTCAAC TCACGCAAGA CTTGTAATCC TCGTTTGGCA CGGCTGGTTG 480 

CTACAATTTC CTCAATGGAA ACACGTTTCA ACCTTCCACG CTGGGTCAAG ACGTACAAGG 540 

ACGAAGTATT ACAGATAAAC CCAGATTGGA CCACATCATC TTCTTTCAAA TTCATAGCCT 600 

TGACACCTGC TGCCTTAGCA CCGACAACCG GAACCTCTTC GATATTGAAA CGCAGGGCAT 660 

AACCATTTTG ACTAACCAAG ACAACATCAT CTAGTTTAAT CGGAGCCACT CCTACAATCT 720 

GATCTGTATC GTCTTTGAGC TTAGCATACT TGACAGACTT AGATCTATAG GTCCGCCATG 780 

GAGTGAATTC TTTTCGCTCT ACCCGTTTGA TTTGACCAAG GCGAGTCACT GCAAAGTAGG 840 

TTGTCGCATC GTCAAACTCA TCCAGTACTT CCACATAAAG GATTTCTTCA TTCGTTTCAA 900 

AGTTTGTGAT GGTTTGGCTC AGATGCTCTC CGATGTCCTT CCAACGAATA TCTGCCAACT 960 

CATGGATTGG TCTGTAGATG ACATTTCCAA GACTTGTGAA CATCAAGAGG TGCTGGGTTG 1020 

TCTTGGCAGA TTGAACAAAA ATCAAACGGT CATCATCACG CTTGCCAATT TCTTCCAAGG 1080 

TGGAAGCCGC AAAGGAACGT GGACTGGTAC GCTTGATGTA ACCTGCCTTG GTCACGCTGA 1140 
CGTAGGTATC TTCCTCAGCG ATAAGACTAG CTGTATCAAT CTCAATTGCT TTCGCAGTGT 1200 

CTTCTAAAGA ACTCAAACGA GGAGTTGCAA ATTTCTTCTT GACCTCACGA AGTTCTTTCT 1260 

TCATGAGATT GTACATAGTC CTTTCATCAC CGATAATAGC CGCCAGCATA GCAATCTTCT 1320 

CACGAAGCTC TGCTTCTTCT TCCTGCAAGA CAACCACATC GGTATTGGTC AAACGGTACA 1380 

GTTGCAAAGT TACGATAGCC TCAGCCTGTT CTTCCGTAAA ATCATAGCTA ACTTTGAGGT 1440 

TTTCCTTGGC GTCCGCCTTA TTCTCAGAAG CACGGATAAG AGCAATGACT TCATCCAAAA 1500 

TCGAAATCAC ACGAATCAAA CCTTCGACGA TATGGAGACG TTTCTCAGCC TTTTCTTTGT 1560 

CAAAGCGTGA ACGCGCCAAA ATCACTTCTC GACGGTGAGC GATATAGCTA GACAGGATTG 1620 

GAACAATCCC AACCTGACGA GGTGTGAAAT TGTCAATCGC CACCATATTA AACTTGTAGT 1680 

TGATTTGTAG GTCGGTGTAC TTAAATAAGT AGTTGAGAAC AAGCTCAGTA TTAGCGTCTT 1740 

TTTTAAGTTC GATAGCGATA CGAAGACCAT CACGGTCAGA CTCATCACGA ACCTCAGCAA 1800 

TCCCAGCTAC CTTGTTATTA ACACGAACAT CATCGATTTT CTTGACTAGA TTGGCCTTAT 1860 

Tfl ATTT C ATA AGGAATCTCA ATAATAACGA TTTGTTCCTT ACCACCTTTT AGCTTTTCAA 1920 

TTTCAGTCTT GGAACGAACA ACCACGCGCC CTTTCCCAGT CTCATAAGCT TTCTTGATTT 1980 

CATCACGACC CTGAATAATA GCCCCTGTAG GGAAGTCTGG TCCAGGCAAG AATTCCATGA 2040 

GTTTATCAAT CTTTGCAGTT GGGTGGTCAA TCATGTAAAC TGCAGCATCT ATGACCTCAG 2100 

CTAAATTATG GGGAGGAATG TCTGTGGCAT AACCAGCCGA AATCCCAGTC GAACCATTGA 2160 

CCAAGAGGTT TGGAAAGGCT GCTGGCAAGA CCGTTGGTTC TTTCTCCGTA TCGTCAAAGT 2220 

TCCATGCAAA AGGAACTGTC TTTTTCTCGA TATCCTGAAG AAGGTAGCCT GCAATTTCAG 2280 

ACAAACGTGC CTCAGTATAA CGCATAGCCG CAGGAGGATC TCCGTCCATA GAACCGTTAT 2340 

TACCGTGCAT TTCAACTAGA ATCTCACCAT TTTTCCAGTT CTGTGACATA CGAACCATGG 2400 

CATCATAGAT AGAAGAATCC CCGTGTGGGT GGAAATTCCC CATGATGTTC CCGACTGACT 2460 

TGGCCGACTT ACGGTAGCTC TTGTCAAAAG TATTGCTATC CTTATTCATA GAATAAAGAA 2520 

TACGGCGCTG AACCGGCTTC AACCCATCAC GAATATCTGG CAAAGCCCGG TCTTGAATAA 2580 

TGTACTTGGA GTAGCGACCA AACCGCTCTC CCATGATGTC CTCCAGGGAC ATGTTTTGAA 2640 

TGTTAGACAT AAGATACAAA GCCCATAAAA TACCAAGTGA AAATAGAAAA TTCTTGAAGT 2700 

AAGCAAACTC ACAAGAGAAT TTATCTTTTT CACACAGTAT CTAGGGCGTG TTCAACTCCT 2760 

TTCAAAGAAT GTAGAGTAGG TTTTTATGCA GTAAAAGATA TTTTACGGGA ATTCCTCCCG 2820 

TGTTCAGTTA CGATAAGTAA CCAAACTATC CTGTTTGTAT TTTTCAATAT GAAAATCTGG 2880 

TTTTCCAAAA TTAGTCTTAG TTTGTGTCTT AGCCGCTCCC TTAAGCGCCT CTTTGAGATA 2940 

AGCACTCATA GCAGATTCTT CATTAATAAT CCTGCAATTT TTTCAAACCA AGATTTTCAA 3000 

ACTGCTTTTT CACATAGTCA TTCACATCCG ACTCTAATTT CCAGTTTACT AACATATTAT 3060 

TTTCTTTCAT TAAAACACTG TCGTTTCTTC TAGCGTAAAC TTGACATTAT CTTCAATCCA 3120 

TTTACGGCGT GGTTCTACCT TATCTCCCAT GAGAACATTG ACGCGGCGTT CGGCGCGCGC 3180 

TAAATCTTCA ATTGTGACAC GGATGAGGGT ACGTGTTTCT GGGTTCATGG TTGTTTCCCA 3240 

GAGCTGGTCC GCATTCATCT CACCAAGTCC TTTGTATCGT TGGAGGGTAG CGCCTTTACC 3300 

GAACTGTTTA CGGAGTTCTT CTAGTTCTCC GTCCGTCCAA GCGTAGGCCA CTTCTTCTTT 3360 

CTTGCCTTTA CCTTTGGACA TCTTGTAAAG AGGTGGGAGG GCAATATAGA CATGACCTGC 3420 

CTCGACTAGC GGACGCATGT AACGGTAGAA AAATGTCAAG AGCAAGGTCT GGATATGGGC 3480 

ACCGTCGGTA TCCGCATCGG TCATGATAAT GATCTTATCA TAGTTGGCAT CTTCAATAGA 3540 

GAAGTCTGCT CCAACACCCG CACCAATGGT ATAAATCATG GTATTGATCT CTTCATTTTT 3600 

GAGGATATCC GCCATCTTGG CCTTGGCTGT ATTGACAACC TTACCACGAA GAGGTAGAAT 3660 

AGCCTGGAAC TTGCGGTCAC GACCTTGTTT GGCAGAACCA CCGGCAGAGT CCCCCTCAAC 3720 

TAGATAGAGT TCATTCTTAG CAGGATTCTT AGATTGGGCT GGGGTCAATT TCCCAGACAA 3780 

CAAGCCCTTA TCTTTCTTGT TTTTCTTCCC ATTTCGGCTC TCATCACGCC CCTTACGTGC 3840 

TGCTTCACGA GCATCACGGG CCTTGATAGC CTTGCGGATG AGGTTAGAAG CTAATTCCCC 3900 

ATTTTCCATA AGGAAAAAGG TCAACTTATC AGCCACTATT CCATCCACAA CTGGGCGAGC 3960 

TAGGGGGCTT CCTAGTTTAT CCTTGGTCTG TCCTTCAAAC TGCAAGTGTT CTTCAGGAAC 4020 

TAAGATAGAA AGAACGGCCG CTAGTCCCTC ACGATAGTCT GAACCTTCAA GCTTTTTATC 4080 

TTTTTCCTTG AGAAGACCTG TTTTACGTGC ATACTCATTC ATGACCTTGG TAATGGCAGA 4140 

CTTGAGTCCT GTCTCGTGCG TTCCACCGTC CTTGGTGCGA ACGTTATTGA CAAAAGATAG 4200 

AATGTTATCT GAGAATCCGT CATTGTACTG GAGGGCTACT TCCACTTGAA AACCATTGTC 4260 

TTCCCCTTCA AAGTAAAGAA CTGGCGTCAA GATTTCCTTA TCTTCGTTGA GATAAGAAAC 4320 

AAAATCTTGT ACTCCATTCT CATAGTGGAA CTCAATCGCT TCATTTGTTC GCTTGTCCGT 4380 

TAAAGACAAG GTCACATTTT TCAAGAGAAA GGCTGATTCA TTAAGGCGCT CTGAAATGGT 4440 

ATTGTACTTG AAATCTGTCG TAGAAAATAT AGTCGCGTCA GGCATAAAAG TAACTTTGGT 4500 

GCCTGTTTTA GACTTGGGTG CTGTACCGAT TTTCTTCAAA GTCGTGACAG GTTTTCCACC 4560 

ATTTTCGAAA CGTTGCTTGT AAACTGCGCC ATCACGGGTA ATTTCAACTT CTAACCAGCT 4620 

AGAAAGGGCG TTAACAACGG AAGAACCCAC TCCGTGAAGT CCACCTGATG TCTTATAGCC 4680 
ACCTTGACCG AATTTCCCTC CGGCATGAAG AATGGTAAAG ATAACCTCAA CAGTTGGAAT 4740 

TCCCATAGCG TGCATACcTG TCGGCATCCC ACGTCCATGG TCTTGAACCG TTAGACTACC 4800 

GTCTTTATTG ATAGTTACAT CAATACGATC ACCAAACCCA GACAAGGCTT CATCGACTGC 4860 

ATTATCAACG ATTTCCCAAA CTAGGTGATG AAGACCAGCG CCATCGGTCG ATCCAATATA 4920 

CATCCCTGGA CGTTTTCGGA CCGCATCCAA CCCTTCTAGC ACCTGAATAG CATCATCATT 4980 

ATAATTGTTA ATATTGATTT CCTTTTTTGA CACAAGGAAC CTCCTATTCG TTCATCTTTA 5040 

CTATTCTACA GGTTTTCCAA GGATTTTGCA AAATTTTTCT TTCTCCGATG TGACAATTTC 5100 

AGCAGAGATT CTCTGCTTTT CTTTCCCAAT TCATGATATA ATAGGAGTAT GATTACAATA 5160 

GTTTTATTAA TCCTAGCCTA TCTGCTGGGT TCGATTCCAT CTGGTCTCTG GATTGGACAA 5220 

GTATTCTTTC AAATCAATCT ACGCGAGCAT GGTTCTGGTA ACACTGGAAC GACCAACACC 5280 

TTCCGCATTT TAGGTAAGAA AGCTGGTATG GCAACCTTTG TGATTGACTT TTTCAAAGGA 5340 

ACCCTAGCAA CGCTGCTTCC GATTATTTTT CATCTACAAG GCGTTTCTCC TCTCATCTTT 5400 

GGACTTTTGG CTGTTATCGG CCATACCTTC CCTATCTTTG CAGGATTTAA AGGTGGTAAG 5460 

GCTGTCGCAA CCAGTGCTGG AGTGATTTTC GGATTTGCGC CTATCTTCTG TCTCTACCTT 5520 

GCGATTATCT TCTTTGGAGC TCTCTATCTT GGCAGTATGA TTTCACTGTC TAGTGTCACA 5580 

GCATCGATTG CGGCTGTTAT CGGGGTTCTG CTCTTTCCAC TTTTTGGTTT TATCCTGAGT 5640 

AACTATGACT CTCTCTTCAT CGCTATTATC TTAGCACTTG CTAGTTTGAT TATCATTCGT 5700 

CATAAGGACA ATATAGCTCG TATCAAAAAT AAAACTGAAA ATTTGGTCCC TTGGGGATTG 5760 

AACCTAACCC ATCAAGATCC TAAAAAATAA AATGCCAGTT CTGTACTGCC CCCAAACAGT 5820 

TAGACAAATA ATTTATCCAA AGGATTTAGT TCTGTACTGC ACAGGACTAA GTCCTTTTAG 5880 

TTTTACCTTA ATTCGTTTGT TGTTGTAGTA ATCAATATAG TCTATAATGG CTTGTTCCAA 5940 

TTGATTAAGT GATTTAAATG TTTTCTCATA GCCATAAAAC ATTTCGGATT TTAAAATGCC 6000 

AAAGAAAGAT TCCATCCTAC CGTTGTCTTG GCTGTTGCCC TTACGTGACA TGGATGCTTG 6060 

AATTCCCTTA CTCTCTAGGA ACCGATGATA AGAATCGTGT TGGTATTGCC AGCCTTGGTC 6120 

ACTATGGAGA ATCGTATTCT CGTAGTGCTT CTCTGTGAAT GCCTGTTCCA A 6171 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18475 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



TATTACAAAT AAAAAAACGG AGGAGTGCTT TATGAAAGCC TATACTTATG TTAAACCAGG 60 

ACTTGCTTCT TTTGTTGATG TAGACAAACC AGTTATTCGC AAGCCAACAG ACGCTATTGT 120 

GCGTATTGTA AAAACCACTA TTTGTGGAAC AGACCTCCAT ATTATCAAAG GGGATGTTCC 180 

TACTTGCCAA AGTGGTACCA TTCTTGGCCA CGAAGGGATT GGGATTGTTG AAGAAGTTGG 240 

GGAAGGAGTT TCCAACTTCA AAAAAGGTGA CAAGGTCTTG ATTTCTTGCG TCTGTGCCTG 300 

TGGTAAATGC TACTACTGTA AAAAAGGAAT TTATGCTCAC TGTGAAGACG AAGGGGGCTG 360 

GATTTTCGGT CACTTGATTG ATGGTATGCA GGCTGAATAT CTACGTGTCC CTCATGCAGA 420 

TAATACTCTT TACCATACTC CAGAAGACTT GTCAGATGAA GCTTTGGTTA TGCTGTCAGA 480 

CATTCTGCCT ACTGGATATG AAATTGGTGT CTTAAAAGGG AAACTAGAAC CTGGTTGCAG 540 

CGTAGCCATT ATTGGTTCAG GTCCAGTTGG ATTGGCTGCT CTTTTAACAG CCCAATTCTA 600 

TTCACCAGCT AAATTGATTA TGGTAGACCT AGACGATAAC CGCTTGGAAA CTGCCCTATC 660 

ATTCGGTGCG ACTCATAAGG TTAATTCTTC AGACCCTGAA AAAGCCATTA AAGAAATTTA 720 

TGATTTGACA GATGGTCGTG GTGTGGATGT CCCTATCGAA GCTGTTGGTA TTCCTGCAAC 780 

ATTTGATTTC TGTCAAAAGA TTATCGGTGT AGACGCAACG GTTGCCAACT GTGGTGTGCA 840 

TGGTAAACCA GTTGAATTCG ATTTAGATAA ACTTTGGATT CGCAACATCA ATGTAACAAC 900 

TGGTTTGGTA TCTACAAATA CGACTCCACA ATTGTTGAAA GCACTTGAAA GTCATAAGAT 960 

TGAACCGGAA AAATTGGTAA CTCACTATTT CAAACTCAGT GAAATTGAAA AACCCTACGA 1020 

AGTCTTCAGT AAGGCAGCAC ACCACCATGC CATTAAGGTC ATTATCGAAA ACGATATCTC 1080 

AGAAGCCTAA GTAGTAAAAA TATTTTTGTA CATAAGTAAA TAGAAATTCA GTCATCCATC 1140 

AGATGGCTGG ATTTTTTATC AAAAAATTAA GAAATGAGCA TATTTCTTTC CTTGTCTGGC 1200 

GGAATTGGTT ATAATATACG GTACAAAGGA ATGAATCAAT ATGTATCGTG TTATAGAAAT 1260 

GTACGGAGAT TTTGAACCGT GGTGGTTCTT AGAAGGTTGG GAAGAAGATA TTGTAGCAAG 1320 

TAGAAAATTT GACCAGTATT ATGATGCTCT CAAATACTAC AAAACTTGCT GGTTTAGATT 1380 

GGAACAAGAA TCGCCTCTTT ATAAAAGTAG AAGCGACTTG ATGACCATTT TTTGGGACCC 1440 

GGAAGACCAA CGCTGGTGTG ATGAATGTGA TGAGTATTTA CAACAATACC ATTCTTTGGC 1500 

TCTTTTGCAG GATGAGCAGG TTATCCCAGA CGAAAAACTA CGCTCAGGCT ATGAAAAACA 1560 

AACCAGTCAG GAAAGGAATC GTTCTTGCCG TATGAAATTA AAATAGAGAA AAGTAACTTT 1620 

TTTGGAGTTG CTTTTTTTAT TTTTCTAACT CTTTGCGAAT AGTATAGGTG AGGAGGTAAG 1680 

TATGGTTCAA GAAATTGCAC AAGAAATCAT TCGTTCAGCT CGGAAAAAAG GGACGCAGGA 1740 

TATCTATTTT GTCCCTAAGT TAGACGCCTA TGAGCTTCAT ATGACGGTAG GAGACGAGCG 1800 

CTGTAAAATT GGTAGCTATG ATTTTGAAAA GTTTGCAGCC GTTATCAGTC ACTTTAAGTT 1860 

TGTGGCGGGT ATGAATGTGG GAGAAAAAAG ACGTAGTCAA CTGGGTTCCT GTGATTATGC 1920 

CTATGACCAT AAGATAGCGT CTCTACGTTT ATCTACTGTA GGCGATTATC GGGGGCATGA 1980 

GAGTTTGGTT ATCCGTTTGT TGCACGATGA GGAGCAGGAC CTGCATTTTT GGTTTCAGGA 2040 

TATTGAAGAA TTAGGCAAGC AGTACAGGCA ACGGGGACTC TATCTTTTTG CTGGTCCGGT 2100 

TGGGAGTGGT AAGACGACCT TGATGCATGA ATTGTCCAAG TCACTCTTTA AAGGACAGCA 2160 

AGTTATCTCC ATCGAAGATC CTGTCGAAAT CAAGCAGGAC GACATGCTTC AGTTGCAGTT 2220 

GAACGAAGCA ATCGGCCTAA CCTATGAAAA TCTAATCAAA CTTTCCTTGC GTCATCGACC 2280 

AGATCTCTTG ATTATCGGAG AAATTCGTGA CAGCGAGACG GCGCGTGCAG TGGTCACAGC 2340 

TAGTTTGACA GGTGCGACAG TCTTTTCAAC CATTCACGCC AAGAGTATCC GAGGTGTTTA 2400 

TGAGCGTCTG CTGGAGTTGG GTGTGAGTGA AGAAGAATTG GCAGTTGTTC TGCAAGGAGT 2460 

CTGCTACCAG AGATTAATCG GGGGAGGAGG AATCGTTGAC TTTGCAAGCA GAGATTATCA 2520 

AGAACACCAA GCAGCCAAGT GGAATGAGCA AATTGACCAG CTTCTTAAAG ATGGACATAT 2580 

CACAAGTCTT CAGGCTGAGA CGGAAAAAAT TAGCTACAGC TAAGCAAAAA AATATCATCA 2640 

CCCTATTTAA CAATCTCTTT TCTAGCGGTT TTCATCTGGT GGAGACTATC TCCTTTTTAG 2700 

ATAGGAGTGC TTTGTTGGAC AAGCAGTGTG TGACCCAGAT GCGTGTGGCC TTGTCTCAGG 2760 

GGAAATCATT CTCAGAAATG ATGGAAAGTT TGGGATGTTC AAGTGCTATT GTCACTCAGT 2820 

TATCCCTAGC TGAAGTTCAT GGCAATCTCC ACCTGAGTTT GGGAAAGATA GAAGAATATC 2880 

TGGACAATCT GGCTAAGGTC AAGAAAAAAT TGATTGAAGT AGCGACCTAT CCCTTGATTT 2940 

TGCTGGGTTT TCTTCTCTTA ATTATGCTGG GGCTACGGAA TTACCTGCTC CCACAACTCG 3000 

ATAGTAGCAA TATTGCCACC CAAATTATCG CTAATCTGCC CCAAATTTTT CTAGGCATGG 3060 

TAGGGCTTCT TTCCGTGCTT GCCCTTTTAG CACTCACTTT TTATAAAAGA AGTTCTAAGA 3120 

TGAGTGTCTT TTCTATCTTA GCACGCCTTC CCTTTATTGG AATCTTTGTG CAGACCTACT 3180 

TGACAGCCTA TTATGCACGT GAATGGGGGA ATATGATTTC ACAGGGAATG GAGTTGACGC 3240 

AGATTTTTCA AATGATGCAG GAACAAGGTT CCCAGCTCTT TAAAGAAGTC GGTCAAGATC 3300 

TGGCTCAAAC CCTGAAAAAT GGCCGTGAAT TTTCTCAGAC GATAGGAACC TATCCTTTCT 3360 

TTAGGAAGGA ATTGAGTCTC ATCATAGAGT ATGGGGAAGT TAAGTCCAAG CTGGGTAGTG 3420 

AGTTGGAAAT CTATGCTGAA AAAACTTGGG AAGCCTTTTT TACCCGAGTC AACCGCACCA 3480 
TGAATTTGGT GCAGCCACTG GTTTTTATCT TTGTGGCACT GATTATCGTT TTACTTTATG 3540 

CGGCAATGCT CATGCCCATG TATCAAAATA TGGAGGTAAA TTTTTAAAAT GAAAAAAATG 3600 

ATGACATTCT TGAAAAAAGC TAAGGTTAAA GCTTTTACAT TGGTGGAGAT GTTGGTGGTC 3660 

TTGCTGATTA TCAGCGTGCT TTTCTTGCTC TTTGTACCTA ATCTGACCAA GCAAAAAGAA 3720 

GCAGTCAATG ACAAAGGAAA AGCAGCTGTT GTTAAGGTGG TGGAAAGCCA GGCAGAACTT 3780 

TATAGCTTAG AAAAGAATGA AGATGCTAGC CTAAGAAAGT TACAAGCAGA TGCACGCATC 3840 

ACGGAAGAAC AGGCTAAAGC TTATAAAGAA TACAATGATA AAAATGGAGG AGCAAATCGT 3900 

AAAGTCAATG ATTAAGGCCT TTACCATGCT GGAAAGTCTC TTGGTTTTGG GACTTGTGAG 3960 

TATCCTTGCC TTGCGCTTGT CCGGCTCTGT CCAGTCCACT TTTTCAGCGG TAGAGGAACA 4020 

GATTTTCTTT ATGGAGTTTG AAGAACTCTA TCGGGAAACC CAAAAACGCA GTGTAGCCAG 4080 

TCAGCAAAAG ACTAGTCTGA ACTTAGATGG GCAGACGCTT AGCAATCGCA CTCAAAACTT 4140 

GCCAGTCCCT AAAGGAATTC AGGCCCCATC AGGCCAAAGT ATTACATTTG ACCGAGCTGG 4200 

GGGCAATTCG TCCCTGGCTA AGGTTGAATT TCAGACCAGT AAAGGAGCGA TTCGCTATCA 4260 

ATTATATCTA GGAAATGGAA AAATTAAACG CATTAAGGAA ACAAAAAATT AGGGCAGTGA 4320 

TTTTACTGGA AGCACTAGTC GCTCTAGCTA TCTTTGCCAG CATTGCCACC CTCCTTTTGG 4380 

GACAAATTCA AAAAAATAGG CAAGAGGAAG CAAAAATCTT GCAAAAGGAA GAAGTCTTGA 4440 

GGGTAGCTAA GATGGCCCTG CAGACGGGGC AAAATCAGGT AAGCATCAAC GGAGTTGACA 4500 

TTCAGGTATT TTCTAGTGAA AAAGGATTGG AGGTCTACCA TGGTTCAGAA CAGTTGTTGG 4560 

CAATCAAAGA GCCATAAGGT CAAGGCTTTT ACCTTGTTAC AATCCCTGCT TGCCCTCATT 4620 

GTCATCAGTG GCGGATTACT CCTTTTTCAA GCTATGAGTC AGCTCCTCAT TTCAGAAGTT 4680 

CGCTACCACC AACAAAGCGA GCAAAAGGAG TGGCTCTTGT TTGTGGACCA ACTTGAGGTA 4740 

GAATTAGACC GTTCGCAGTT CGAAAAAGTA GAAGGCAATC GCCTATACAT GAAGCAAGAT 4800 

GGCAAGGACA TCGCCATCGG TAAGTCAAAG TCAGATGATT TCCGTAAAAC GAATGCTCGT 4860 

GGTCGAGGTT ATCAGCCTAT GGTTTATGGA CTCAAATCTG TACGGATTAC AGAGGACAAT 4920 

CAACTGGTTC GCTTTCATTT CCAGTTCCAA AAAGGCTTAG AAAGGGAGTT CATCTATCGT 4980 

GTGGAAAAAG AAAAAAGTTA AGGCAGGTGT TCTCCTCTAC GCAGTCACCA TAGCAGCCAT 5040 

CTTTAGTCTT TTGTTGCAAT TTTATTTGAA CCGACAAGTC GCCCACTATC AAGACTATGC 5100 

TTTGAATAAA GAAAAATTGG TTGCTTTTCC TATGGCTAAA CGAACCAAAG ATAAGGTTGA 5160 

GCAAGAAAGT GGGGAACAGT TTTTTAATCT AGGTCAGGTA AGCTATCAAA ACAAGAAAAC 5220 
TGGCTTAGTG ACGAGGGTTC GTACCGATAA GAGCCAATAT GAGTTTCTGT TTCCTTCAGT 5280 

CAAAATCAAA GAAGAGAAAA GAGATAAAAA GGAAGAGGTA GCGACCGATT CAAGCGAAAA 5340 

AGTGGAGAAG AAAAAATCAG AAGAGAAGCC TGAAAAGAAA GAGAATTCAT AGTCAATTCA 5400 

ACTATAATGC GTTGAATCCA GAATAGTCCA CTGTAGTTTC TAGAAAATTG CTGGAAATGG 5460 

ATGTTAAGCT CCAATTCATT TGTTTATATC TTATTTCAGT TTACTATACT TTGTGCTAAA 5520 

TTAAACATAT GAAACATGAT TTTAACCACA AAGCAGAAAC TTTCGATTCC CCTAAAAATA 5580 

TCTTCCTCGC AAACTTGGTA TGTCAAGCAG CCGAGAAACA GATTGATCTT CTATCAGACA 5640 

AAGAAATTTT AGATTTCGGT GGTGGCACGG GTCTATTAGC CTTGCCCCTA ACCCCTAGCC 5700 

AAGCAGGCTA AGTCAGTCAC TCTTCTAGAC ATTTCTGAGA AAATGTTGGA GCAAGCTCGT 5760 

TTGAAAGTGG AGCAGCAAGC AATCAAGAAT ATCCAGTTTT TGGAGCAAGA TTTACCGAAA 5820 

AATCCCTTGG AGAAAGAGTT TGATTGCCTT GCTGTTAGTC GGGTTCTTCA TCATATGCCT 5880 

GATTTGGATG CGGCTCTCTC ACTGTTTCAT CAACATTTGA AGGAAGATGG GAAACTCATC 5940 

ATTGCTGATT TTACCAAGAC AGAAGCTAAT CATCATGGAT TTGATTTAGC TGAACTGGAA 6000 

AACAAGCTAA TTGAGCATGG TTTTTCATCT GTGCATAGTC ACATTCTCTA TAGTGCTGAA 6060 

GACCTGTTTC AAGGAAATCA CTCAGAATTC TTTTTAATAC TAGCCCAAAA ATCACTCGCC 6120 

TAGTCAGGGA GTGATTTTTC TATAAGGATG GAAAAAAGAA GGGAAATTTG GTAAGATAGG 6180 

AATATGGATT TTGAAAAAAT TGAACAAGCT TATACCTATT TACTAGAGAA TGTCCAAGTC 6240 

ATCCAAAGTG ATTTGGCGAC CAACTTTTAT GACGCCTTGG TGGAGCAAAA TAGCATCTAT 6300 

CTGGATGGTG AAACTGAGCT AAACCAGGTC AAGGAGAACA ATCAAACCCT TAAGCGTTTA 6360 

GCACTACGCA AAGAAGAATG GCTCAAGACC TACCAGTTTC TCTTGATGAA GGCTGGGCAA 6420 

ACAGAACCCT TGCAGGCCAA TCACCAGTTT ACACCGGATG CTATTGCTTT GCTTTTGGTG 6480 

TTTATTGTGG AAGAGTTGTT TAAAGAGGAG GAAATTACTA TCCTCGAAAT GGGTTCTGGG 6540 

ATGGGAATTC TAGGCGCTAT TTTCTTGACC TCGCTTACTA AAAAGGTGGA TTACTTGGGA 6600 

ATGGAACTCG ATGATTTGCT GATTGATCTG GCAGCTAGCA TGGCAGATGT AATTGGTTTG 6660 

CAGGCTGGCT TTGTCCAAGG AGATGCCGTT CGCCCACAAA TGCTCAAAGA AAGCGATGTG 6720 

GTCATCAGTG ACTTGCCTGT CGGCTATTAT CCTGATGATG CCGTTGCGTC GCGCCATCAA 6780 

GTTGCTTCTA GCCAAGAACA TACTTACGCC CATCACTTGC TCATGGAACA AGGGCTTAAG 6840 

TACCTCAAGT CAGACGGATA CGCTATTTTT CTAGCTCCGA GTGATTTGTT GACCAGTCCT 6900 

CAAAGTGATT TGTTAAAAGA ATGGCTGAAA GAAGAGGCGA GTCTGGTTGC TATGATTAGT 6960 

CTGCCTGAAA ATCTCTTTGC TAATGCCAAA CAATCTAAGA CTATTTTTAT CTTACAGAAG 7020 
AAAAATGAAA TAGCAGTAGA GCCTTTTGTT TATCCACTTG CTAGCTTGCA AGATGCAAGT 7080 

GTTTTAATGA AATTTAAAGA AAATTTTCAA AAATGGACTC AAGGTACTGA AATATAAAAT 7140 

AGATTTTGTT ATAATAGTTG AAAACGCTTA AAAAGGGGTA TCATGTTATG ACAAAAACAA 7200 

TTGCAATCAA TGCAGGAAGT TCAAGTTTGA AATGGCAATT ATACTTAATG CCAGAAGAAA 7260 

AAGTATTGGC GAAAGGTTTG ATTGAACGTA TCGGTTTGAA AGATTCAATT TCAACTGTAA 7320 

AATTTGACGG CCGTTCTGAA CAACAAATTT TGGATATTGA AAATCATATA CAAGCCGTTA 7380 

AAATTTTATT GGATGACTTG ATTCGTTTCG ATATTATCAA GGCTTATGAC GAGATTACAG 7440 

GTGTTGGACA TCGTGTTGTT GCTGGTGGAG AATATTTCAA AGAATCAACA GTTGTTGAGG 7500 

GAGATGTTTT AGAAAAAGTT GAAGAGTTGA GTTTGTTGGC TCCTCTACAC AACCCGGCCA 7560 

ATGCAGCAGG TGTTCGTGCC TTCAAGGAAT TGTTGCCAGA CATTACCAGT GTAGTTGTTT 7620 

TTGATACTTC CTTCCACACA AGTATGCCAG AGAAAGCTTA TCGCTACCCT CTACCAACAA 7680 

AATATTACAC AGAAAACAAG GTTCGTAAAT ACGGTCCTCA TGGTACAAGT CACCAGTTTG 7740 

TAGCAGGAGA AGCTGCAAAA CTCTTGGGAC GTCCATTAGA AGACTTGAAG TTAATTACCT 7800 

GTCATATTGG TAACGGAGGC TCAATTACAG CTGTGAAAGC CGGCAAATCT CTACACACTT 7860 

CTATGGGGTT CACTCCTCTT GGTGGTATTA TGATGGGAAC GCGTACAGGG GATATTGATC 7920 

CACCTATCAT TCCTTATTTA ATGCAATATA CAGAGGATTT TAACACACCA CAAGATATCA 7980 

GTCCTGTTCT TAACCGTGAA TCAGGTCTTT TGGGAGTTTC TGCTAATTCT AGCGATATGC 8040 

GCGATATAGA AGCAGCTGTA GCAGAAGGGA ATCACGAGGC TAGCTTGGCT TATGAAATCT 8100 

ATGTTGACCG TATCCAAAAA CATATCGGTC AGTACCTTGC AGTGCTAAAT GGAGCAGATG 8160 

CCATTGTTTT CACAGCAGGT GTCGGTGAAA ATGCACAGAG TTTCCGTCGT GATGTAATCT 8220 

CACGGATTTC GTGGTTTGGT TGTGATGTTG ATGATGAAAA GAATGTCTTT GGCGTTACAG 8280 

GAGACATCTC AACAGAGGCA GCTAAAATCC GTGTCTTGGT TATTCCAACA GATGAAGAAT 8340 

TAGTCATTGC CCGTGACGTT GAACGCTTGA AAAAATAAGT GAAACTAAAA AAATATTCAA 8400 

TACAAGGAGT TGGGAAAGTT ATTTTTCCAG CTTCTTTTTC TGATGAAATT GTCCAAAACC 8460 

TTGCTATGAT TGGCTTTTTT GAAAAATATG GTATAATAGT AGTAATTTAA TAGATGGAGT 8520 

TGAGTTTTGA AGAAAAACTT TCGTGTAAAA AGAGAGAAAG ATTTTAAGGC GATTTTCAAG 8580 

GAGGGGACAA GTTTTGCTAA TCGCAAATTT GTGGTCTACC AATTAGAAAA CCAGAAAAAC 8640 

CGTTTTCGAG TAGGTCTATC AGTTAGCAAA AAACTGGGGA ATGCCGTCAC TAGAAATCAA 8700 

ATTAAGCGAC GGATTCGGCA TATTATCCAG AATGCAAAAG GGAGTCTGGT AGAAGATGTC 8760 
GACTTTGTTG TCATTGCTCG AAAAGCAGTC GAAACCTTGG GA'PftGC 7AGA GATGGAGAAA 8820 

AATCTACTCC ATGTATTAAA ATTATCAAAG ATTTACCGGG AAGGAAATGG GAGTGAAAAA 8880 

GAAACTAAAG TTGACTAGTT TGCTAGGACT GTCTCTGTTA ATCATGACAG CCTGTGCGAC 8940 

TAATGGGGTA ACTAGCGATA TTACAGCCGA ATCGGCTGAT TTTTGGAGTA AATTGGTTTA 9000 

CTTCTTTGCG GAAATCATTC GCTTTTTATC GTTTGATATT AGTATCGGAG TGGGGATTAT 9060 

TCTCTTTACG GTCTTGATTC GTACAGTCCT CTTGCCAGTC TTTCAGGTGC AAATGGTGGC 9120 

TTCTAGGAAA ATGCAGGAAG CTCAGCCACG CATTAAGGCG CTTCGAGAAC AATATCCAGG 9180 

TCGAGATATG GAAAGCAGAA CCAAACTAGA GCAGGAAATG CGTAAAGTAT TTAAAGAAAT 9240 

GGGTGTCAGA CAGTCAGACT CTCTTTGGCC GATTTTGATT CAGATGCCGG TTATTTTGGC 9300 

CCTGTTCCAA GCCCTATCAA GACTTGACTT TTTAAAGACA GGTCATTTCT TATGGATTAA 9360 

CCTTGGTAGT GTGGATACAA CCCTTGTTCT TCCGATTTTA GCAGCAGTAT TCACCTTTTT 9420 

AAGTACTTGG TTGTCCAACA AAGCTTTGTC TGAGCGAAAT GGCGCTACGA CTGCGATGAT 9480 

GTATGGGATT CCAGTCTTGA TTTTTATCTT TGCAGTTTAT GCGCCAGGTG GAGTCGCCCT 9540 

ATACTGGACA GTGTCTAATG CTTATCAAGT CTTGCAAACC TATTTCTTGA ATAATCCATT 9600 

CAAGATTATC GCAGAGCGCG AGGCCGTAGT ACAGGCACAA AAAGATTTGG AAAATAGAAA 9660 

AAGAAAAGCC AAGAAAAAGG CTCAGAAAAC GAAATAAATA AGGAGGAATC TGGTAGTGGT 9720 

AGTATTTACA GGTTCAACTG TTGAAGAAGC AATCCAGAAA GGATTGAAAG AATTAGATAT 9780 

TCCAAGAATG AAGGCTCATA TCAAAGTCAT TTCTAGGGAG AAAAAAGCCT TTCTTGGTCT 9840 

ATTTGGTAAA AAACCAGCCC AAGTGGATAT TGAAGCGATT AGTGAAACGA CTGTTGTCAA 9900 

AGCAAATCAA CAGGTAGTAA AAGGCGTTCC GAAAAAAATC AATGATTTGA ACGAGCCTGT 9960 

GAAGACGGTT AGTGAAGAAA CCGTTGACCT TGGTCATGTG GTTGATGCTA TTAAAAAAAT 10020 

AGAGGAAGAA GGTCAAGGTA TTTCTGATGA AGTCAAGGCT GAAATCTTAA AACATGAAAG 10080 

ACATGCCAGC ACTATCTTAG AAGAAACTGG TCACATTGAG ATTTTAAATG AACTTCAAAT 10140 

CGAGGAAGCG ATGAGGGAAG AAGCAGCCGC TGATGACCTT GAAACTGAGC AAGACCAAGC 10200 

TGAAAGTCAA GAACTAGAAG ACTTGGGCTT GAAAGTTGAA ACGAACTTTG ATATTGAACA 10260 

AGTAGCTACG GAAGTAATGG CTTATGTTCA AACGATTATT GATGACATGG ATGTTGAGGC 10320 

TACACTTTCA AATGATTATA ACCGTCGTAG CATCAATCTA CAAATTGACA CCAACGAACC 10380 

AGGTCGTATT ATCGGCTACC ATGGTAAAGT CTTGAAGGCC TTGCAACTGT TGGCTCAAAA 10440 

TTATCTTTAC AACCGCTATT CCAGAACCTT CTACGTTACA ATCAATGTCA ATGATTATGT 10500 

CGAACACCGT GCAGAAGTCT TGCAGACCTA TGCGCAAAAA TTGGCGACTC GTGTTTTGGA 10560 



WO 98/18931 



PCT/US97/19588 




AGAAGGGCCC AGTCATAAAA CAGATCCAAT GTCAAATAGC GAACGCAAGA TTATCCATCG 10620 

TATTATTTCA CGTATGGATG GCGTGACTAG TTACTCTGAA GGTGATGAGC CAAATCGCTA 10680 

TGTTGTTGTA GATACAGAAT AAGTAAAATC AGGTTTATCC TGATTTTTTG CTAGTTAGAG 10740 

GAGGTTAAAC TGATGTTGAA TAAGATAAGA GACTATTTAG ACTTTGCTGG TTTGCAGTAC 10800 

CGTAATCCTG ATAAAGCGGG AGCAGAGCGA GAGAAGATGC TGGCATTCCG CCACAAAGGA 10860 

CAAGAGGCCC GAAAGGTTTT TACAGAACTG GCCAAAGCCT TTCAAGCAAG CCATCCAGAA 10920 

TGGCAACTCC AACAGACTAG CCAGTGGATG AATCAGGCCC AGCGTTTGAG ACCACATTTT 10980 

TGGGTTTATC TACAGAGAGA CGGACAAGTG ACAGAACCTA TGATGGCCTT ACCTTTGTAT 11040 

GCGACATCTA CTGACTTTGG AATTTCTTTG GAAGTCAGTT TCATCCAACG TAAGAAGGAT 11100 

GAGCAAACAC TGGGCAAGCA GGCCAAAGTT TTAGACATTC CAACCGTTAA AGGGATTTAT 11160 

TATCTAACCT ACTCTAATGG TCAAAGTCAA CGGTGGGAGG CGAATGAAGA AAAGCGTCGT 11220 

ACTTTACGCG AGAAGGTGAG AAGTCAAGAA GTTCGAAAAG TTTTAGTGAA GGTAGATGTT 11280 

CCTATGACAG AAAATTCGTC TGAAGAAGAA ATCGTAGAAG GCTTATTGAA GTCTTATTCT 11340 

AAAATTCTTC CCTATTATCT AGCTACGAGA AAATAAGATA ATTTGTAAAA CATCATAAAT 11400 

CATACAGTCC AAGAGTGAAC AGTCCGCTGT GTAATTCTTG GTCTTTTTGT TTGCGCTTTC 11460 

GCATTATATA ATAAACTTAC AAAAACAATT CAAAAGGAGA ACAATTATGG AAGTCGTTTC 11520 

AAGTGTTCTA AATTGGTTTT CTACCAATAT TTTGCAGAAT CCCCCATTTT TCGTAGGTTT 11580 

ATTGGTGTTG ATAGGATATG CACTTTTGAA AAAACCTGCC CATGACGTTT TTTCAGGGTT 11640 

TGTTAAAGCA ACAGTAGGGT ATATGTTGCT TAACGTGGGT GCTGGTGGTT TGGTTACAAC 11700 

CTTTCGTCCA ATCTTAGCAG CTCTTAACTA CAAATTCCAA ATTGGTGCAG CGGTTATCGA 11760 

CCCTTACTTT GGACTTGCTG CAGCAAACAA CAAAATTGTA GCAGAGTTTC CAGATTTTGT 11820 

TGGAACTGCA ACTACAGCTC TATTGATTGG TTTTGGAATA AATATCTTGC TCGTAGCTCT 11880 

TCGAAAGATT ACGAAGGTAA GAACCCTCTT TATTACTGGT CACATCATGG TACAACAAGC 11940 

TGCAACAGTA TCTCTTATGG TTCTATTCTT AGTACCACAA TTGCGCAATG CTTACGGTAC 12000 

AGCAGCGATT GGTATCATCT GTGGACTTTA CTGGGCAGTT AGTTCAAATA TGACTGTTGA 12060 

GGCAACTCAA CGCTTGACTG GTGGTGGCGG ATTTGCGATT GGTCACCAAC AGCAATTTGC 12120 

AATCTGGTTT GTAGATAAAG TAGCAGGACG CTTTGGTAAG AAAGAAGAAA GTTTAGACAA 12180 

TCTTAAATTA CCTAAGTTCC TCTCAATCTT CCACGATACA GTTGTTGCAT CTGCTACCTT 12240 

GATGCTCGTA TTCTTCGGAG CCATTCTTTT AATCTTGGGT CCAGACATTA TCTCTAATAA 12300 
AGAAGTCATC ACTTCAGGAA CTCTATTCAA TCCTGCTAAA CAAGATTTCT TTATGTACAT 12360 

TATCCAAACA GCCTTTACCT TCTCAGTTTA CTTGTTCGTT TTGATGCAAG GTGTCCGAAT 12420 

GTTCGTATCT GAGTTGACAA ACGCCTTCCA AGGTATTTCA AACAAATTGT TGCCAGGTTC 12480 

ATTCCCAGCG GTTGACGTTG CAGCTTCTTA TGGATTTGGT TCTCCAAATG CTGTCTTGTC 12540 

AGGATTTACC TTTGGTTTGA TTGGTCAATT GATTACAATT GTTTTGCTCA TCGTCTTTAA 12600 

AAATCCGATT CTTATTATTA CAGGATTTGT ACCAGTGTTC TTTGACAATG CAGCCATTGC 12660 

GGTCTACGCT GATAAACGCG GCGGATGGAA AGCGGCTGTT ATCCTTTCCT TTATATCAGG 12720 

TGTCCTTCAA GTTGCTCTAG GAGCTCTTTG TGTGGCCCTT CTCGATTTGG CATCTTATGG 12780 

TGGCTACCAT GGAAATATCG ACTTTCAATT CCCATGGCTT GGATTTGGAT ATATCTTCAA 12840 

ATACCTTGGT ATTGTTGGTT ATGTACTTGT GTGTCTCTTC TTGCTTGTTA TTCCTCAACT 12900 

TCAATTTGCC AAAGCAAAAG ATAAAGAGAA ATATTACAAC GGTGAAGTTC AAGAAGAAGC 12960 

TTAGTATCTA GAAAAGGAGA AATAAAATGG TTAAAGTATT AGCAGCGTGC GGAAATGCAA 13020 

TGGGTTCATC AATGGTTATC AAGATGAAGG TTGAAAATGC TCTCCGTAAG CTTAATCAAA 13080 

CAGATTTTAC AGTCAATTCA TGCAGTGTCG GTGAAGCTAA AGGTTTAGCA GTAGGATATG 13140 

ACATCGTAAT CGCTTCTCTT CATTTGATTC AAGAATTGGA AGGGCGAACT AATGGGAAGT 13200 

TAATTGGGCT TGATAACTTG ATGGATGATA AAGAAATCAC CGAAAAACTC AGTCAAGCAC 13260 

TACAGTAAAA GGTTGGAGGG GGCTGGACAG AAACTGAGAG TTATCGTTTC TGTCCTTCTC 13320 

CCTCTTTAAA TAAAGGAGGC AGATATGAAT TTAAAACAAG CTTTAATTGA CAATGACTCG 13380 

ATCCGACTAG GTTTAGAGGC TAACAATTGG AAAGAAGCAG TCAACGTACC AGTAGATCCC 13440 

TTAATTGAAA GTGGGGCAAT TTTGCCAGAG TATTACGATG CTATCATTGA ATCGACTGAA 13500 

GAGTATGGGC CTTACTATAT CTTGATGCCA GGTATGGCTA TGCCCCACCC TAGACCTGAA 13560 

GCAGGTGTGC AAAGTGATGC CTTTTCATTG ATTACCTTAC AAAATCCTCT TGTATTTTCA 13620 

GATGGGAAAG AGGTATCTGT TTTGTTGGCA CTAGCAGCAA CAAGTTCAAA AATTCACACA 13680 

AGTGTAGCCA TTCCACAAAT TATTGCCCTA TTTGAATTAG AAGATTCTAT TGCACGTTTA 13740 

CAGGCTTGCC AGACTAAAGA AGATGTCTTG GCTATGATTG AAGAATCTAA GGATAGCCCT 13800 

TATCTCGAAG GATTGGATTT GGAAAGTTAG AAAGAGGAAT AAAGAAATGA CAAAAAGAAT 13860 

ACCTAATTTA CAAGTTGCAT TAGACCATTC AGACTTGCAA GGAGCGATTA AAGCAGCTGT 13920 

TTCTGTTGGT CAGGAAGTAG ATATTATCGA AGCTGGAACT GTTTGCTTGC TTCAAGTTGG 13980 

AAGTGAACTG GCTGAAGTCT TGCGTAGCCT TTTCCCACAT AAGATTATTG TGGCAGACAC 14040 

AAAATGTGCT GATGCTGGTG GAACAGTTGC TAAAAATAAT GCGGTTCGTG GAGCAGACTG 14100 
GATGACTTGT ATCTGTTGTG CAACCATCCC TACTATGGAA GCAGCTCTAA AGGCTATCAA 14160 

GACTGAACGA GGAGAACGAG GCGAAATCCA GATCGAGCTT TATGGCGATT GGACTTTTGA 14220 

ACAAGCTCAG CTTTGGCTAG ATGCAGGTAT CTCACAAGCT ATTTATCACC AATCTCCTGA 14280 

TGCTCTTCTT GCTGGTGAAA CTTGGGGTGA AAAAGACCTT AATAAGGTTA AAAAACTCAT 14340 

TGACATGGGC TTCCGTGTAT CTGTAACAGG TGGTCTAGAT GTAGATACTC TCAAACTCTT 14400 

TGAAGGTATT GATGTCTTTA CCTTTATCGC AGGTCGTGGA ATTACAGAGG CTGTGGATCC 14460 

AGCAGGAGCA GCGCGTGCCT TCAAGGATGA AATCAAACGA ATTTGGGGGT AAATCATGGT 14520 

ACGTCCAATT GGAATTTATG AAAAGGCAAC CCCAACACAC TGTACTTGGC TAGAACGTTT 14580 

AAATTTTGCC AAGGAGTTAG GCTTTGATTT TGTCGAGATG TCTATTGACG AACGTGACGA 14640 

GCGTTTAGCA AGACTTGACT GGAGTAAGGA AGAACGCTTG GAAGTTGTCA AAGCAATCTA 14700 

TGAAACTGGT GTTCGTATTC CTTCTATCTG TTTTTCACGC CATCGTCGCT ACCCATTGGG 14760 

TTCAAAAGAT CCAGTTCTAG AGGAAAAATC TCTAGAACTC ATGAAAAAAT GTATCGAATT 14820 

AGCTCAAGAC TTGGGAGTTC GTACGATTCA ATTAGCTGGT TACGATGTTT ACTATGAGGA 14880 

AAAGTCACCC CAGACACGCC AACGTTTTAT CAAAAATTTG AGAAAAGCCT GTGACTGGGC 14940 

TGAAGAAGCT CAGGTGGTAC TTGCTATTGA AATTATGCAT GATCCTTTCA TCAGTAGCAT 15000 

CGAAAAATAT TTGGCTATAG AAAAAGAGAT TGACTCTCCC TTCCTCTTTG TATATCCAGA 15060 

TATTGGTAAT GTGTCTGCAT GGCATAATGA TATCTATAGT GAGTTTTATC TTGGTCATCA 15120 

TGCCATCGCA GCTCTCCATC TCAAGGATAC TTATGCAGTG ACAGAAACTT CAAAGGGCCA 15180 

GTTCCGAGAT GTACCTTTCG GGCAAGGTTG TGTCAAATGG GAAGAACCTT TCGATATTTT 15240 

AAAGGAAACC AATTATAATG GACCTTTCCT AATCGAAATG TGGTCTGAAA ATTCTGAAAC 15300 

AGTAGAAGAA ACACGCGCAG CCATTCAAGA GGCGCAACCT TTTCTCTATC CACTCATTAA 15360 

GAAAGCAGGT TTGATGTAAG ATGAATCAAG TAATCAATGC TATGCGTAAA CGAGTCTGTG 15420 

ATGCCAATCA ATCATTGCCA AAACATGGAC TTGTCAAATT TACCTGGGCG AATGTATCTG 15480 

AAGTTAATCG CGAACTCGGT GTCATTGTTA TCAAACCATC AGGCGTGGAT TATGACGAAT 15540 

TGACACCTGA AAACATGGTA GTGACTGATC TAGATGGTAA GATCCTAGAA GGGGATTTAA 15600 

GACCATCTTC CGACCTCCCA ACTCATGTGC AATTATATAA GACTTGGTCA GAAATTGGTA 15660 

GTGTGGTTCA CACCCATTCG ACAGAAGCTG TTGGTTGGGC TCAGCCAGGT CGTGATATTC 15720 

CTTTCTACGG AACAACCCAT GCAGATTATT TCTACGGTTC AATCCCTTGC GCCCGTAGTT 15780 

TGACCAAGGA CGAAGTAGAA GTGGCCTATG AAAAAGATAC TGGCCTGGTT ATCGTAGAAG 15840 

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AGTTTGAACA 


TCGCGGACTT 


AACCCGGTTG 


AAGTACCAGG 


AATTGTTGTA 


CGCAATCACG 


15900 


GTCCATTCAC 


CTGGGGCAAA 


AATCCAGAGA 


ATGCTGTTTA 


TCACTCTGTC 


GTACTAGAGG 


15960 


AAGTATCAAA 


GATGAATCGC 


TTTACAGAAC 


AAATCAATGC 


AAGAGTTGGA 


CCTGCTCCCC 


16020 


AGTACATACT 


AGAAAAACAC 


TACCAACGTA 


AACATGGACC 


AAATGCTTAT 


TATGGTCAAA 


16080 


AGTAAGAACG 


ATGAAGGAGG 


AGAAAAAGAT 


AAATTT AG CT 


CCTCTTTTTA 


CATTTGATTT 


16140 


TTATTGAGAG 


TAAAGTTGGA 


GTTGAAGTAA 


TTTTAAAAGA 


TTTTTTAGAA 


ATAGCGCTTG 


16200 


ATATATATAT 


GGTAAAATAA 


AAAGAATTGC 


TGTGATATCA 


ATAGATTTGG 


GGGATTTTTT 


16260 


AATATGGTAC 


TGGATAAGGC 


AAGTTGTGAT 


TTGCTTCAAT 


ATTTGATGGA 


TCAAGAAACG 


16320 


TCCAAAACGA 


TTATGGCGAT 


TTCGAAAGAT 


TTGAAAGAGT 


CAAGAAGGAA 


AATTTATTAT 


16380 


CACATTGACA 


AAATCAATGC 


TGCTCTGGGT 


GACGAGGCGC 


TTCACATCAT 


TAGTATTCCA 


16440 


CGAATTGGTA 


TTCACTTAAC 


GGAAGAGCAG 


AGAGATGCTT 


GTTGTAAACT 


ATTATCGGAA 


16500 


GTAGATTCGT 


ACGATTATAT 


CATGAGTGCG 


CATGAACGTA 


TGATGATAAT 


GTTACTATGG 


16560 


ATAGGTATTT 


CTAAAGAACG 


TATTACGATT 


GAAAAATTGA 


TAGAGTTAAC 


AGAGGTATCT 


16620 


AGGAATACTG 


TTCTCAATGA 


TTTGAATAGT 


ATTCGTTATC 


AACTAACTTT 


GGAACAATAT 


16680 


CAGCTGATCT 


TGCAAGTGAG 


CAAGTCACAG 


GGATACAACC 


TTCATGCCCA 


CCCTCTTAAT 


16740 


AAAATTCAGT 


ATCTTCAATC 


GCTTCTATAT 


CATATTTTTA 


TGGAAGAAAA 


TCCCACTTTT 


16800 


GTATCTATTT 


TAGAAGATAA 


GATGAAAGAG 


AGGTTAGATG 


ATGAGTGTTT 


GCTTTCTGTT 


16860 


GAAATGAACC 


AATTTTTTAA 


GGAACAGGTT 


CCTTTAGTTG 


AACAAGATTT 


AGGGAAGAAA 


16920 


ATAAACCATC 


ATCAAATAAC 


TTTTATGTTG 


CAGGTTCTAC 


CTTATTTCCT 


GTTAAGCTGT 


16980 


CATAATGTTG 


AACAGTATCA 


AGAAAGACAT 


CAGGATATAG 


AGAAAGAATT 


TTCTTTGATA 


17040 


AGAAAAAGAA 


TAGAGTATCA 


GGTGTCTAAG 


AAATTAGGAG 


AACGGTTGTT 


TCAAAAGTTT 


17100 


GAAATTTCTT 


TGTCACGACT 


TGAAGTTTCT 


CTTGTAGCTG 


TTCTCCTCCT 


CTCCTATCGT 


17160 


AAAGATTTGG 


ATATTCATGC 


AGAAAGTGAT 


GA 1 1 1 i v_C»ot_ 




1 \JV» 111 /»v»f t/\ 


17220 

X- 1 tr *r \J 


GAATTTATCT GGTATTTTGA ATCACAAATC CGAATGGAGA TTGAGAACAA GGATGATTTG 17280

TTACGAAATT TGATGATCCA CTGTAAAGCC TTGTTATTTA GAAAGACTTA CGGTATTTTT 17340

TCTAAAAATC CTCTAACAAA ACAAATTCGA TCCAAGTATG GAGAATTATT TTTAGTCACT 17400

AGAAAATCTG CGGAAATTTT AGAAGGAGCA TGGTTTATTC GGCTAACAGA CGATGATATT 17460

GCCTATTTGA CGATTCATAT TGGAGGATTT TTAAAATATA CACCATCATC TCAAAAAAAT 17520

ATGAAAAAAG TTTATCTCGT TTGTGATGAA GGTGTTGCGG TTTCGAGACT TTTGCTGAAA 17580

CAATGCAAAC TTTATTTTCC AAATGAGCAA ATTGACACTG TATTTACAAC AGAACAATTT 17640

AAGAGTGTGG AAGATATTGC ACAAGTTGAT GTAGTGATTA CTACTAATGA TCATTTGGAT 17700

AGCAGATTTC CGATTTTAAG GGTTAATCCT ATCCTTGAAG CAGAAGATAT TTTGAAAATG 17760 

CTAGACTATC TTAAACACAA TATATTTCGT AATAAGAGCA AAAGTTTCAG TGAAAATCTT 17820 

TCTAGTCTTA TTTCGTCTTA TATTGTAGAC AGCAAGTTGG CTAGTAAGTT CCAAGAAGAG 17880 

GTTCAAACAC TTATAAATCA AGAAATAGTA GTTCAAGCTT TTTTGGAAGr TATTTGAAGG 17940 

ACAGTCCAAT GATGAACACA AACCTGTGTk TTTCsTGGTC TTTTtTACTG TTTTGAAGGG 18000

TGGkATACTA ATCTCAAAGA TAACAATTAT ATCCAAAGGA GGCAACATAT GCCAAACGTC 18060 

AAAGAAATTA CAAGAGAGTC ATGGATTTTA GCCACTTTCC CAGAGTGGCG AACATGGTTG 18120 

AACGAAGAAA TCGAAGAAGA AGTCGTACCT GAAGGCAACT TTGCCATGTG GTGGCTAGGC 18180 

AACTGTGGTA CTTGGATTAA GACACCAGCT GGTGCTAACG TTGTCATGGA CCTTTGGTCA 18240 

AACCGTGGAA AATCAACCAA AAAAGTGAAA GATATGCTTC GTGGGCACCA AATGGCAAAT 18300 

ATGGCAGGTG TTCGTAAGCT GCAACCAAAC TTGCGTGTTC AGCCAATGGT TATCGATCCA 18360 

TTTCCTATCA ACGAACTAGA CTATTACTTA GTTTCACACT TCCACAGTGA TCATATCGAC 18420 

CCATACACAG CTGCAGCAAT TCTCAATAAT CCTAAGTTAG AGCATCTTAA GTTGG 18475
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7186 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCAGGATTTG GTACCGTTGC AAGTGGTGTG CCTTTCCTCC TAAAGGAAAA TGGAGGAAAA 60 

ATCAATCAAT CAGCACATTC AGATATCAAA GTTGCTAAGG TATTGGTCAA GGATGAAGAT 120 

GAAAAAAATC GCTTGCTTGC AGCAGGGAAT GACTTTAACT TTGTAACCAA TGTGGATGAT 180 

ATTTTATCAG ACCAGGATAT TACTATCGTA GTGGAATTGA TGGGGCGTAT TGAGCCTGCT 240

AAAACCTTTA TCACTCGTGC CTTGGAAGCT GGAAAACACG TTGTTACTGC TAACAAGGAC 300

CTTTTAGCTG TCCATGGCGC AGAATTGCTA GAAATCGCTC AAGCTAACAA GGTAGCACTT 360

TACTACGAAG CAGCAGTTGC TGGTGGGATT CCAATTCTTC GTACTTTAGC AAATTCCTTG 420

GCTTCTGATA AAATTACGCG CGTGCTTGGA GTAGTCAACG GAACTTCCAA CTTCATGGTG 480

ACCAAGATGG TGGAAGAAGG CTGCTCTTAC GATGATGCTC TTGCGGAAGC ACAACGTCTA 540

GGATTTGCAG AAAGCGATCC GACGAATGAC GTAGATGGGA TTGATGCAGC CTACAAGATG 600

GTTATTTTGA GCCAATTTGC CTTTGGCATG AAGATTGCCT TTGATGATGT AGCCCACAAG 660 

GGAATCCGCA ATATCACACC AGAAGACGTA GCTGTAGCTC AAGAGCTTGC TTACGTAGTG 720 

AAATTGGTTG GTTCTATTGA GGAAACTTCT TCAGGTATTG CTGCAGAAGT GACTCCAACC 780 

TTCCTACCTA AAGCGCACCC ACTTGCTAGT GTGAATGGCG TXATGAACGC TGTCTTTGTA 840 

GAATCTATCG GTATTGGTGA GTCTATGTAC TACGGACCAG GTGCGGGTCA AAAACCAACT 900 

GCAACAAGTG TTGTAGCTGA TATTGTCCGT ATCGTTCGTC GTTTGAATGA TGGTACTATT 960 

GGCAAAGACT TCAACGAATA TAGCCGTGAC TTGGTCTTGG CAAATCCTGA AGATGTCAAA 1020 

GCAAACTACT ATTTCTCAAT CTTGGCTCTA GACTCAAAAG GTCAGCTCTT GAAGTTGGCT 1080

GAAATCTTCA ATGCTCAAGA TATTTCCTTT AAGCAAATCC TTCAAGATGG CAAAGAGGGT 1140 

GACAAGGCGC GTGTCGTTAT CATCACACAC AAGATTAATA AAGCCCAGCT TGAAAATGTC 1200 

TCAGCTGAAT TGAAGAAGGT TTCAGAATTC GACCTCTTGA ATACCTTCAA GGTGCTAGGA 1260 

GAATAACATG AAGATTATTG TACCTGCAAC CAGTGCCAAT ATCGGGCCAG GTTTTGACTC 1320 

GGTCGGTGTA GCTGTAACCA AGTATCTTCA AATTGAGGTC TGCGAAGAAC GAGATGAGTG 1380 

GCTGATTGAA CACCAGATTG GCAAATGGAT TCCACATGAC GAGCGTAATC TCTTGCTCAA 1440 

AATCGCTTTG CAAATTGTAC CAGACTTGCA ACCAAGACGC TTGAAAATGA CCAGTGATGT 1500 

CCCTTTGGCG CGCGGTTTGG GTTCTTCCAC CTCGGTTATC GTTGCTGGGA TTGAACTAGC 1560 

CAACCAACTG GGTCAACTCA ACTTATCAGA CCATGAAAAA TTGCAGTTAG CGACCAAGAT 1620 

TGAAGGGCA? CCTGACAATG TGGCTCCAGC CATTTATGGT AATCTCGTTA TTGCAAGTTC 1680 

TGTTGAAGGG CAAGTCTCTG CTATCGTAGC AGACTTTCCA GAGTGTGATT TTCTAGCTTA 1740 

CATTCCAAAC TATGAATTAC GTACTCGCGA CAGCCGTAGT GTCTTGCCTA AAAAATTGTC 1800 

TTATAAGGAA GCTGTTGCTG CAAGTTCTAT CGCCAATGTA GCGGTTGCTG CCTTGTTGGC 1860 

AGGAGACATG GTGACCGCTG GGCAAGCAAT CGAGGGAGAC CTCTTCCATG AGCGCTATCG 1920 

TCAGGACTTG GTAAGAGAAT TTGCGATGAT TAAGCAAGTG ACCAAAGAAA ATGGGGCCTA 1980 

TGCAACCTAC CTTTCTGGTG CTGGGCCGAC AGTTATGGTT CTGGCTTCTC ATGACAAGAT 2040 

GCCAACAATT AAGGCAGAAT TGGAAAAGCA ACCTTTCAAA GGAAAACTGC ATGACTTGAC 2100 

AGTTGATACC CAAGGTGTCC GTGTAGAAGC AAAATAAAGA ATAGAAGATA GGATGGGGAA 2160 

ACTCTTGACC AGAGGGGTTC ATATCCTTTT TGTGAAAAGA AGTTTATACT CAATGAAAAT 2220 

CAAAGAGCAA ACTAGGAAGC TAGCCGCAGG CTGCTCAAAA CAGTGTTTTG ACGTTGCAGA 2280 

TAGAACTGAC GAAGTCAGCT CAAGACACTG TTTTGAGGTT GCAGATAGAA CTGACGAAGT 2340 






CAGTAACCAT ACTACGGTAA GGTGACGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT 2400

TAGTTAAAAA CGTGATAAAG GAGAAATAAA GATGGCAGAA ATTTATCTAG CAGGTGGTTG 2460

TTTTTGGGGC CTAGAGGAAT ATTTTTCACG CATTTCTGGA GTGCTAGAAA CCAGTGTTGG 2520

CTACGCTAAT GGTCAAGTCG AAACGACCAA TTACCAGTTG CTCAAGGAAA CAGACCATGC 2580

AGAAACGGTC CAAGTGATTT ACGATGAGAA GGAAGTGTCA CTCAGAGAGA TTTTACTTTA 2640

TTATTTCCGA GTTATCGATC CTCTATCTAT CAATCAACAA GGGAATGACC GTGGTCGCCA 2700

ATATCGAACT GGGATTTATT ATCAGGATGA AGCAGATTTG CCAGCTATCT ACACAGTGGT 2760

GCAGGAGCAG GAACGCATGC TGGGTCGAAA GATTGCAGTA GAAGTGGAGC AATTACGCCA 2820

CTACATTCTG GCTGAAGACT ACCACCAAGA CTATCTCAGG AAGAATCCTT CAGGTTACTG 2880

TCATATCGAT GTGACCGATG CTGATAAGCC ATTGATTGAT GCAGCAAACT ATGAAAAGCC 2940

TAGTCAAGAG CTGTTCAAGG CCAGTCTATC TGAAGAGTCT TATCGTGTCA CACAAGAAGC 3000

TGCTACAGAG GCTCCATTTA CCAATGCCTA TGACCAAACC TTTGAAGAGG GGATTTATGT 3060

AGATATTACG ACAGGTGAGC CACTCTTTTT TGCCAAGCAT AAGTTTGCTT CAGGTTGTGG 3120

TTGGCCAAGT TTTAGCCGTC CGATTTCCAA AGAGTTGATT CATTATTACA AGGATCTGAG 3180

CCATGGAATG GAGCGAATTG AAGTTCGTTC TCGTTCAGGC AGTGCTCACT TGGGTCATGT 3240

TTTCACAGAT GGACCGCGGG AGTTAGGCGG CCTCCGTTAC TCTATCAATT CTCCTTCTTT 3300

ACGCTTTGTG GCCAAGGATG AGATGGAAAA AGCAGGATAT GGCTATCTAT TGCCTTACTT 3360

AAACAAATAA AACAGAGAGT GGGGCTTCCC ACTTTCTTCA TTTCTAGAAT ATGAATAGAA 3420

GGGATTTATG AAACACCTAT TATCTTACTT CAAACCCTAC ATCAAGGAAT CAATTTTAGC 3480

CCCCTTGTTC AAGCTGTTAG AAGCTGTTTT TGAGCTCTTG GTTCCCATGG TGATTGCTGG 3540

GATTGTTGAC CAATCTTTAC CTCAGGGAGA TCAAGGTCAT CTCTGGATGC AGATTGGCCT 3600

GCTCCTTATC TTTGCAGTAA TTGGCGTTTT AGTGCCCTTG ATAGCTCAAT TTTACTCAGC 3660

AAAGGCAGCA GTAGGTTCTG CTAAGGAATT GACAAACGAT CTTTATCGTC ATATTCTTTC 3720

CTTGCCCAAG GACAGCAGAG ACCGTCTGAC AACTTCTAGT TTGGTCACTC GCTTGACTTC 3780

GGATACCTAC CAGATTCAGA CTGGTATCAA TCAATTCCTG CGTCTCTTTT TACGAGCGCC 3840

CATTATCGTT TTTGGTGCCA TTTTTATGGC TTATCGAATC TCAGCTGAGT TGACTTTCTG 3900

GTTCTTAGTC TTGGTTGCCA TTTTGACCAT TGTCATTGTA GGGTTATCTC GATTGGTCAA 3960

TCCTTTCTAC AGTAGTCTCA GAAAGAAAAC GGACCAACTG GTTCAGGAAA CGCGCCAGCA 4020

ATTGCAAGGG ATGCGGGTTA TTCCTGCTTT TGGTCAAGAA AAACGAGAGT TACAGATTTT 4080

TCAAACCCTT AACCAAGTTT ATGCTAGATT ACAAGAAAAG ACAGGTTTCT GGTCTAGTTT 4140

ATTAACACCT CTGACCTATC TGATTGTCAA TGGAACTCTT CTCGTTATTA TCTGGCAAGG 4200 

CTATATTTCA ATTCAAGGAG GAGTGCTCAG TCAAGGTGCT CTCATTGCTC TTATCAATTA 4260 

CCTCTTACAG ATTTTGGTGG AATTGGTCAA GCTAGCCATG TTGATCAATT CCCTCAACCA 4320 

GTCCTATATC TCAGTCAAGC GAATCGAGGA AGTCTTTGTT GAGGCTCCAG AGGATATCCA 4380 

TTCAGAGTTA GAACAAAAGC AAGCTACCAG AGATAAGGTT TTACAAGTCC AAGAATTGAC 4440 

CTTTACCTAT CCTGATGCGG CCCAGCCTTC TCTGAGATAC ATTTCCTTTG ATATGACTCA 4500 

AGGACAAATT CTAGGTATCA TCGGGGGAAC TGGTTCTGGT AAATCAAGCT TGGTGCAACT 4560 

CTTACTTGGA CTTTATCCAG TAGACAAGGG GAACATTGAC CTTTATCAAA ATGGACGTAG 4620 

TCCTCTTAAT TTGGAGCAGT GGCGGTCTTG GATTGCCTAT GTACCTCAAA AGGTCGAACT 4680 

CTTTAAAGGA ACCATTCGTT CCAACTTGAC TCTAGGTTTC AATCAAGAAG TATCTGACCA 4740 

GGAACTCTGG CAGGCCTTGG AGATTGCGCA AGCTAAGCAT TTTGTCAGTG AAAAGGAAGG 4800 

ACTCTTGGAT GCTCTAGTTG AGGCAGGGGG GCGAAATTTC TCAGGTGGAC AAAAACAAAG 4860 

ATTGTCTATC GCCCGAGCAG TCTTGCGCCA GGCTCCGTTT CTCATCCTAG ATGATGCAAC 4920 

CTCGGCACTG GATACCATTA CAGAGTCCAA GCTCTTGAAA GCTATTAGAG AAAATTTTCC 4980 

AAACACGAGC TTAATTTTGA TCTCTCAACG AACCTCAACT TTACAGATGG CGGACCAGAT 5040 

TCTCCTCTTG GAAAAAGGTG AGTTGCTAGC TGTTGGCAAG CACGATGACT TGATGAAATC 5100 

CAGCCAAGTC TATTGTGAAA TCAATGCATC CCAACATGGA AAGGAGGACT AGAATGAAAC 5160 

GACAAACTGT AAACCACACG CTCAAACGTT TAGCCGTAGA TTTAGCAACC CATCCTTTCC 5220 

TCCTTTTCCT AGCCTTTCTA GGAACTATTG CCCAAGTTGG CTTATCAATT TACCTACCTA 5280 

TTCTGATTGG GCAGGTCATT GACCAAGTCC TAGTGGCTGG TTCATCACCA GTTTTTTGGC 5340 

AGATTTTTCT CCAGATGCTC TTGGTGGTAA TAGGAAATAC TCTGGTACAA TCGGCCAATC 5400

CTCTCCTCTA TAATCGTCTA ATCTTCTCTT ATACCAGAGA TTTACGGGAG CGAATCATCC 5460

ATAAGCTCCA TCGTTTACCG ATTGCCTTTG TAGATAGGCA AGGTAGTGGA GAGATGGTTA 5520

GTCGTGTAAC CACGGACATC GAACAGTTGG CAGCTGGCTT GACCATGATT TTTAACCAAT 5580 

TTTTCATTGG TGTTTTGATG ATTTTGGTCA GTATTCTAGC CATGCTCCAA ATTCATCTCC 5640 

TCATGACTCT CTTAGTCTTG CTGTTGACGC CACTGTCCAT GGTGATTTCA CGCTTTATTG 5700 

CCAAGAAATC CTATCATCTC TTCCAGAAGC AAACAGAGAC GAGGGGAATT CAGACTCAGT 5760 

TGATTGAAGA ATCGCTTAGT CAGCAGACTA TAATCCAGTC CTTCAATGCT CAAACAGAAT 5820 

TTATCCAAAG ATTGCGTGAG GCTCATGACA ACTACTCAGG CTATTCTCAG TCAGCCATCT 5880

TTTATTCTTC AACGGTCAAT CCTTCGACTC GCTTTGTAAA TGCACTCATT TATGCCCTTT 5940

TAGCTGGAGT AGGAGCTTAT CGTATCATGA TGGGTTCAGC CTTGACCGTC GGTCGTTTAG 6000 

TGACTTTTTT GAACTATGTT CAGCAATACA CCAAGCCCTT TAACGATATT TCTTCAGTGC 6060 

TAGCTGAGTT GCAAAGTGCT CTGGCTTGCG TAGAGCGTAT CTATGGAGTC TTAGATAGCC 6120 

CTGAAGTGGC TGAAACAGGT AAGGAAGTCT TGACGACCAG TGACCAAGTT AAGGGAGCTA 6180 

TTTCCTTTAA ACATGTCTCT TTTGGCTACC ATCCTGAAAA AATTTTGATT AAGGACTTGT 6240 

CTATCGATAT TCCAGCTGGT AGTAAGGTAG CCATCGTTGG TCCGACAGGT GCTGGAAAAT 6300 

CAACTCTTAT CAATCTCCTT ATGCGTTTTT ATCCCATTAG CTCGGGAGAT ATCTTGCTGG 6360 

ATGGGCAATC CATTTATGAT TATACACGAG TATCATTGAG ACACCAGTTT GGTATGGTGC 6420 

TTCAAGAAAC CTGGCTCACA CAAGGGACCA TTCATGATAA TATTGCCTTT GGCAATCCTG 6480 

AAGCCAGTCC AGAGCAAGTA ATTGCTGCTG CCAAAGCAGC TAATGCAGAC TTTTTCATCC 6540 

AACAGTTGCC ACAGGGATAC GATACCAAGT TGGAAAATGC TGGAGAATCT CTCTCTGTCG 6600 

GCCAAGCTCA GCTCTTGACC ATAGCCCGAG TCTTTCTGGC TATTCCAAAG ATTCTTATCT 6660 

TAGACGAGGC AACTTCTTCC ATTGATACAC GGACAGAAGT GCTGGTACAG GATGCCTTTG 6720 

CAAAACTCAT GAAGGGCCGC ACAAGTTTCA TCATTGCTCA CCGTTTGTCA ACCATTCAGG 6780 

ATGCGGATTT AATTCTTGTC TTAGTAGATG GTGATATTGT TGAATATGGT AACCATCAAG 6840 

AACTCATGCA TACAAAGGGT AAGTATTACC AAATGCAAAA AGCTGCGGCT TTTACTTCTC 6900 

AATAAGCCAT TCTCTTTTGA AAGTTTATGG ACGAAAAAAG TTGCCTTCCA GTGACTTTTT 6960 

TGTTACAATA GCTAGAAAAA TTGTTCACTG TAATACTCAA TGAAAATCAA AGAGCAAACT 7020 

AGGAAGCTAG CCGTAGGTTG CTCAAAGCAC AGCTTTGAGG TTGTAGATAA GACTGACGAA 7080 

GTCAGTTCAA AACACTGTTT TGAGGTTGCA GATAGAACTG ACGAAGTCAG CTCAAAACAC 7140 

TGTTTTGAGG TTGCAGATAG AACTGACGAA GTCAGCTCAA AACAGG 7186 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

CTGAAAATTC TAAAAAATTT ATAAGTAAGG AATTAATTAG TTATTTTTGT GATAAAGTTT 60 




ATGATGAAAT ATTTGTTGAA GAGGTAGTTC CGCACGTTTT TCTGCCATAT GAATCTGACT 120 

TACTTCTTAT TTTACCAGCT ACGGCAAATG TGATTGGCAA AATTGCTAAT GGTATTGCTG 180

ATGATTTAGT TACAGCAACT GTTTTAAACT TTAATAAAAA AATAATTTTT TGTCCCAATA 240 

TGAACTCTAC TATGTGGGAC AATCACATAG TTCAAAGAAA TGTATCAATT CTAAAGGAGT 300 

TGGGACATAT ATTTTTATTT GAGTCTAAAA AAACATATGA GGTAGGATTG CGTAAAGCAA 360 

TAGATTCAAC ATGTTCAATG TTACAACCAC AGTCGTTAGT AAAAGAACTT ATCAAATTAG 420 

AAAATATTGT CCTTGAAGAG GGACATTAAA AACTACTGAG AATATTAATG AGGGGAAAAA 480

ATGGAAAATT CATCAATCGA TGTAGATATG CTGTTGGAAG AATTGACACA AGAAGCAATG 540

GTCGTTCTTG CTGTTGATAA GGACTGTTAA TTTAAACTTA TGGCAATATA TGAAAGGTTA 600

CTGGATGTTT TAAATTATGC AGGCAGTAGC CTTTTATTAT ATACAAATGG ATAAAGTAAG 660 

GATAATACAA TGATTAATAA AAAAATACAA CAAGTTGTTT TGGAATCATT ACAGAATTTT 720 

TTGAATGGGA ACTTCATTTC GCCTTGTGTA GTCTATGATT TTGGCTTGCT GGAAACTGTA 780 

CTTGATGAAT TTAAAAATCA AATTCCTGTA ACATTCAATT ACCAACTTTT TTATGCCGTT 840 

AAAGCAAATT CAAATGAGAA GATACTTGAA TTCTTAGTAG ATAAAATTGA TGGAGTTGAT 900 

GTGGCGTCAT TATCTGAATT AGATGTGGCT AAAAAATTTT TCCCACCAAC TCAAATTTCT 960 

GTTAATGGTC CCGCATTTTC TTATGAAACT TTATATAATC TGATTAAAAA ACAATATAAA 1020 

GTTGATATTA ACTTTTTGGA ACATCTTCAA CAATTTTCCC CAAAAGAATC TGTTGGAATA 1080 

AGAGTAACGG AGCCAGATGA ACTTAATAAT CGTATGAGTC GATTTGGAAT AAATATTTGC 1140 

AGTGATAATT GGACTAGTAA TTTACAAAAT CCTTTAATTA CACGACTGCA TTTTCATTTT 1200 

GGAGAAAAAG ATGATAAATT TATTGTTAAG TTAGA7AAAA TATTATTTAA GTTACAAGAA 1260 

ATTAATAAAC TTAGAGAGGT TAGAGAAATA AATCTTGGAG GCGGTTTTAT GAAATTATTT 1320 

ATGGAAAATC GTTTGAAAGA ATTTTTTCTA TCACTTATGG AAATCTATAA AAAGTACGAT 1380 

ATTGATAGTA CTGTGACTAC AATAATAGAA CCAGGTACTG CAATTACTTC ATTTTCTGCC 1440 

TATATGATTA CTAGCCCAGT TAATGTTAGT GAGGTGAATG AGCAGCAGGT TATCACGTTA 1500 

GACACATCAA TATACACCAA TACATTATGG TTTGTTCCGC ATATTATTAC AACGTTAAAT 1560

TCAAGTAGTA AAGAGCGTTA TAGTACTATT CTCTATGGTA ATACCTGTTA TGAACATGAC 1620 

AAGTATAAAA TGAAAGTTTC GCTTCCAAGG TTAACTCAAA ATAGCAGTAT AGTGTTTTTT 1680 

CCTGTAGGAG CTTATATAAA AAGCAATCAT TCAAATTTAC ATCGTAATGA TTTTATGCGG 1740 

GAGGTATATT TGTGGACAAA AAACTTGACA TATTAGATAA AGTTAAGGAA TATTTAGGAA 1800 

ATAAAACTAC TCAAATTCTG GATAATCAAT ATAAAGAATT TTTGAAACTT AATGATATAA 1860

GGCGAGCGTT TGGTATTTCA GAAAAAGTAT TAAACAATTC TTTTAATTTT ACGAGTAAAG 1920

AATTTAATGA TTTAATTAAT AACGAAAATT ATTTATTCGA ATATGCATGT AGAATTAGAG 1980 

AGGAATGGAG AAAAAAATGC TTTAATCATT CTTATCGTTT TCTATGCTCA CCTATAATTA 2040 

CAGATGATTT TCTTAACACG AAGACATTGA GAAGTAGCCA AATTGAATAT AAATATGAGC 2100 

GATATTTATC GAAAAGTTCG ATAGGCGATA GAGCGGTTGA TGGCTTTGTT TCCTTCAATA 2160 

CTTTAACAGC TAATGGTATG TCTGCTATTA AACTATGTCT TGAGATATTA AACTCTATTT 2220

TCTTCAAGAA GAAGATTGAT TTATTATATT CAACCGGATA TTATGAAACA AGATTTTTAT 2280 

TAAATAATCT TGCTAAATCA GGTATTAGTT GCTATGAGGT AAGTAATTGT GAATTGGATA 2340 

AAGATAAATT TTATAATGTA TTCATGATGG AACCCAATCG AGCCGATTTA ACATTACAAA 2400

AAACTGATTT CAAGATAGTA GAATATTTTG TTAAGTATAA AAATAATTCA ATAAAAGTCG 2460 

TTATTTTAGA TATTTCATAT CAAGGTTCTA ATTTTAAATT AGTAGAATTT TTAGAGAAAT 2520 

TTAAATTTGC GAATGTAATT ATTTTTGTGG TACGATCTTT GATAAAATTA GATCAAATCG 2580

GATTAGAATT GACAAATGGG GGAATAATAG AAGTGTTTAT TCCTAATCAT TTGAGAAAGT 2640 

TGAAAAATTT TATTGAAGAG GAATTCAATA AATTTAGAAA TTCTCACGGA GCTAATCTAA 2700 

GCCTCTATGA ATACTGTTTG CTTGATAATT CTTTAACTTT AAAAAATGAT TGGAACTATT 2760 

CTGATTTAGT TATGAAATTT ACGAGTAATT TTTATCCTGA TATAAAAGAC TTGTTCATGG 2820 

AAAATTCTGA TATTGAAATC ATCCATGAAG AGGGAGTACC TTTTCTATTT TTAGATTTAA 2880 

TAGGTGAAGG TAAAAAAGAA TATGAAATCT TTTTTCAATG GTTAAACTTC TTTTACAAAC 2940 

AGCTTGGAAT CACATTGTAT GCTAGAAATA GTTTTGGGTT TCGCAATCTA ACAGTAGAGT 3000 

ATTTTGGAAT TATTGGGACA GAAAGATATA TATTTAAGAT TTGTCCAGGT GTTTATAAAG 3060 

GGTTAAGTTA TTATTTGATG AAATTTTTAT TAAAATCTTT TTCAAATCAA TATTTAAAAA 3120 

CTACTGATGA GGTTAATAGA TGAAAAATTT GATAAAGTTG CTAATAATTA GATTGATTGT 3180 

TAACTTAGCA GACAGTGTAT TTTATATAGT AGCATTGTGG CACGTTAGCA ATAATTATTC 3240 

TTCGAGCATG TTCTTAGGAA TATTTATTGC AGTAAATTAT CTACCGGATT TGTTACTAAT 3300

CTTTTTTGGA CCAGTTATTG ACAGAGTAAA TCCGCAAAAA ATTCTTATAA TATCAATTTT 3360

GGTTCAATTA GCAGTGGCTG TAATATTTTT ATTATTATTA AACCAAATAT CATTTTGGGT 3420 

GATAATGAGT CTAGTGTTTA TTTCAGTAAT GGCTAGCTCC ATAAGTTACG TGATAGAAGA 3480 

TGTGTTGATT CCTCAAGTGG TAGAATATGA TAAGATTGTA TTTGCAAATT CTCTTTTTAG 3540 

TATTTCGTAT AAAGTATTAG ATTCTATTTT TAATTCATTC GCATCATTTT TACAGGTGGC 3600

AGTAGGATTT ATTTTATTGG TTAAGATAGA TATAGGCATA TWF=.\CTTG CTCTATTTAT 3660

ATTGTTGTTG TTAAAATTTA GAACTAGCAA TGCGAATATA GAAAACTTCT CTTTCAAATA 3720 

TTACAAGAGA GAAGTGTTGC AAGGTACAAA GTTTATTTTA AATAATAAA7 TATTATTTAA 3780

AACCAGTATT TCTTTAACGC TTATAAACTT TTTTTATTCA TTTCAGACAG TAGTTGTACC 3840 

GATTTTTTCT ATTCGATATT TTGATGGTCC GATTTTTTAT GGTATTTTTT TAACTATTGC 3900 

TGGTTTGGGT GGTATATTGG GAAATATGCT AGCGCCAATC GTAATAAAAT ATTTAAAATC 3960 

GAATCAAATT GTTGGTGTAT TTCTTTTTTT GAACGGCTCA AGTTGGTTAG TAGCAATTGT 4020 

TATAAAAGAC TATACTTTAT CACTTATTTT ATTTTTCGTT TGTTTTATGT CTAAAGGAGT 4080 

CTTCAATATT ATTTTTAATT CGTTGTACCA ACAAATACCT CCACATCAAC TTCTTGGTAG 4140 

GGTAAATACT ACCATTGATT CTATTATTTC TTTTGGAATG CCAATTGGTA GTTTAGTTGC 4200 

AGGAACGCTT ATTGATTTGA ATATTGAATT AGTGTTAATT GCTATTAGCA TACCTTATTT 4260 

TTTCTTTTCT TATATTTTTT ATACGGATAA TGGATTGAAA GAATTTAGTA TATATTAGAA 4320

ATGTTTATGT TCATTCAAAA GCATAATCAC TATAACTGAA AAAGAAAAGT GATATCTTTA 4380

AGGTTGTTCT TCTTGGTGGT GAGATTCGTG AGACAACCCA AGCTTTTGTC GGAAAGATTA 4440 

CCAATGCTTT GATGGATAGG ATGTACTTTA GCAAGATGTT TTTACTGGTA ACGGTATCGT 4500 

GGATGGACGT GTAATAACCT CTTCTTTCGA GGAGTATTTT ACTAAAAAAC TAGCCTTGGA 4560 

GCGTTCCCCA GAAACGGACT TACTCATTCA CTCTTCAAAG ATTTGGGGAG AAGATTTTGC 4620 

TTCATCTGTT CCTTGAAAAA AGTCACAGCA GTCATCACAG ACGATAGTAC TGAACAAAAC 4680 

TATGAAGAGT TAGAAATTTA TACGCAGCTG ATTGTATAAA GGATCTGGAA ATAGATAAGA 4740 

AGTTGATTAG TATTGACCTA GGTGGTACAA ATATTAAGAT TACTGTTCTT TCAAATGACC 4800 

GTGAGATTGA AACTTTGTGG AGTATTACAA CAGATACAAG TGAGAAAGGT TCTCAAATTA 4860 

TATCGGACAT CATCAGTTCT ATTAAAAATA AATTGACCGA ACGGAATATT CCTGATAGCG 4920

ACCTTCTTGG AATCGGTATG GGAAGTTGCT CATCATACTT TCCTTGTAAA TCATAGGGGC 4980

TATAAACTCT CCGTCTACTT GTCCTGCAAC AATTGAAGTC TCCTCAAAAC GCCGTCCGCT 5040

AATCTTTTCA TAGACTTTCT CCCTTTTAGG AGCCTAGCTT TCTAGTTTGT TCTTTGATTT 5100 

TTATTGAGTA TACCACTATT TTACTCCCTC TGGCAAGGGA CTTTGTCTAT GTGGAGGGAT 5160 

TGCGCTCCTA TGTGGTGGAG CTTTTCTGTT CTTTCTGAAA TATGGTATAA TAGCACTAAT 5220 

CAATTTCTAG GAAAATAGAT ACAGAAAGGG GCTGAAAGAT GTCTCATATT ATTGAATTGC 5280 

CACACATGCT GGCAAACCAA ATCGCGGCTG GAGAGGTCAT TGAACGTCCT GCCAGTGTGG 5340 

TCAAAGAGTT GGTAGAAAAT GCCATTGACG CGGGCTCTAG TCAGATTATC ATTGAGATTG 5400

AGGAAGCTGG TCTCAAGAAG GTTCAAATCA CGGATAACGG TCATGGAATT GCCCACGATG 5460

AGGTGGAGTT GGCCCTGCGT CGCCATGCGA CCAGTAAGAT AAAAAATCAA CCAGATCTCT 5520 

TTCGGATTCG GACGCTTGGT TTTCGTGGTG AAGCCTTGCC TTCTATTGCG TCTGTTAGTG 5580 

TCTTGACTCT GTTAACGGCG GTGGATGGTG CTAGTCATGG AACCAAGTTA GTCGCGCGTG 5640 

GGGGTGAAGT TGAGGAAGTC ATCCCAGCGA CTAGTCCTGT GGGAACCAAG GTTTGTGTGG 5700 

AGGATCTCTT TTTCAACACG CCTGCCCGTC TCAAGTATAT GAAGAGCCAG CAAGCGGAGT 5760

TGTCTCATAT CATTGATATT GTCAACCGTC TGGGCTTGGC CCATCCTGAG ATTTCTTTTA 5820 

GCTTGATTAG TGATGGCAAG GAAATGACGC GGACAGCAGG GACTGGTCAA TTGCGCCAAG 5880 

CAATCGCAGG GATTTACGGT TTGGTCAGTG CCAAGAAGAT CATTGAAATT GAGAACTCTG 5940 

ACCTAGATTT CGAAATTTCA GGTTTTGTGT CCTTGCCTGA GTTGACTCGG GCTAACCGCA 6000 

ATTATATCAG CCTCTTCATC AATGGCCGTT ATATTAAGAA CTTCCTGCTC AATCGTGCTA 6060 

TTTTGGATGG TTTTGGAAGC AAGCTTATGG TTGGACGTTT TCCACTGGCT GTCATTCACA 6120 

TCCATATCGA CCCTTATCTA GCGGATGTCA ATGTGCATCC AACTAAGCAA GAGGTGCGGA 6180 

TTTCCAAGGA AAAAGAACTG ATGACTCTGG TTTCAGAAGC TATTGCAAAT ACTCTCAAGG 6240 

AACAAACCTT GATTCCAGAT GCCTTGGAAA ATCTTGCCAA ATCGACCGTG CGCAATCCTG 6300 

AGAAGGTGGA GCAAACTATT CTCCCACTCA AAGAAAATAC GCTCTACTAT GAGAAAACTG 6360 

AGCCGTCAAG ACCTAGTCAA ACTGAAGTAG CTCATTATCA GGTAGAATTG ACTGATGAAG 6420 

GGCAGGATTT GACCCTGTTT GCCAAGGAAA CCTTGGACCG ATTCACCAAG CCAGCAAAAC 6480 

TGCATTTTCC AGAGAGAAAG CCTGCTAACT ACGACCACCT AGACCATCCA CAGTTAGATC 6540 

TTGCTAGCAT CGATAAGGCT TATGACAAAC TGGACCGAGA AGAAGCATCC AGCTTCCCAG 6600 

AGTTGGAGTT TTTCGGACAA ATGCACGGGA CTTATCTCTT TGCCCAAGGG CGAGATGGAC 6660 

TTTACATCAT AGATCAGCAC CCTGCTCAGG AACGGGTCAA GTACGAGGAG TACCGTGAAA 6720 

GCATTGGCAA TGTTGACCAA AGCCAGCAGC AACTCCTAGT GCCCTATATC TTTGAATTTC 6780 

CTGCGGATGA TGCCCTGCGT CTCAAGGAAA GAATGCCTCT CTTAGAGGAA GTGGGCGTCT 6840 

TTCTAGCAGA GTACGGAGAA AATCAATTTA TTCTACGTGA ACATCCTATT TGGATGGCAC 6900 

AAGAAGAGAT TGAATCAGGC ATCTATGAGA TGTGCGACAT GCTCCTTTTG ACCAAGGAAG 6960 

TTTCTATCAA GAAATACCGA GCAGAGCTGG CTATCATGAT GTCTTGCAAG CGATCTATCA 7020 

AGGCCAATCA TCGTATTGAT GATCATTCAG CTAGACAACT CCTCTATCAG CTTTCTCAAT 7080 

CTGACAATCC CTATAACTGT CCTCACGGAC GTCCTGTTTT GGTGCATTTT ACCAAGTCGG 7140

ATATGGAAAA GATGTTCCGA CGTATTCAGG AAAATCACAC CAGTCTCCGT GAGTTGGGGA 7200

AATATTAAAA GTATAAAAAA GTCTGGGAAA AATTTTCAAA ATCAAAAAAA CGCATAAAAT 7260 

CAGGTGTTCA AAAACCTTGA TTTTATGCGT TTTATCATGG AAATAGTTAC TTCATTTTTT 7320 

CCTAATTCTT TTCGAAACTC TTTTTAAACG ACGTCAGTTT TATCAGTAAT CTCAAAACAG 7380 

TGTTTTGAGC TAATTTTGCC AGTTTTGTCT GTAACATCGA AGTTCTGTTT TACCACTCTG 7440 

CGACTGGTTT CCTAGTTTGC TCTATGATTT TCACAGAGCA TTAAATTGCG ATTTTGCCAA 7500 

GTTTCTTTAT TCGTCTAAAA GTAGAGTCTG TTCTATGCGT CTAATGTACG AATCAGGTTG 7560 

ACCATTTCAA TAGCTCCTTG TGCACACTCA GAACCCTTAT TTCCTGCTTT AGTACCAGCT 7620 

CCTTCTATCG CTTGTTCAAT TGTATCTCTC GTTAGCACAC CAAACATAAC AGGAATTTCG 7680 

CTATTTAAAC TGATTTGGGC GATTCCCTTA GATACCTCGC TACATACATA ATCATAATGA 7740 

CTTGTATTCC CTCTAATGAC AGCTCCCAAG CAGATAATTG CATCATATTT TTTACTTTTT 7800

GCCATTTTTG ATGCAATCAG TGGTATTTCA AAAGCTCCTG GAACCCAGGC TACCTCTATA 7860 

TCTTTCTCGT TTACATTCTC TCTTTTGAGA TTATCTAGTG CTCCAGATAA TAATTTTGAA 7920 

GTTATAAATT CATTAAATCT CGCTACAACA ATACCTATTT TAATATTGTT TGCTACTAAA 7980 

TTACCTTCAT AAGTGTTCAT TTATTTTTCC TCCATATTTA AAATGTGACC CATTCGATTT 8040 

TTCTTTGTTT CTAAATAAAA ACTATCGTAA GGATTGGCTT CTATTTCGAT TGATATTCTA 8100 

CTGGAAATGG TAATTCCATA TTTTTCTAAC TGTTCAACCT TGTCAGGATT ATTTGTCAGT 8160 

AAATGAAGTG ACTGAAGTCC CAGATCTTTA AGCATTTTTG CTCCAATATG ATATTCTCTT 8220 

AAATCACCTT CAAAGCCTAA TGCAAGATTG GCATCAAGCG TATCCATGCC TTGATCTTCT 8280

AAATGATAGG CTTTTAATTT ATTGATAAGT CCAATTCCTC GTCCCTCCTG TCGCAAGTAA 8340

AGTAAGACAC CCGAACCATT CTCAACAATC ATTTTCATAG CCTTATCGAA TTGCTGTCCA 8400 

CAATCGCAAC GTAAAGAGCC TAAAACATCT CCTGTTAAAC ATTCGGAGTG GACCCGACAT 8460

AATACATTGG CTTCATCCTC TATATTTCCC ATAATAAGAG CAAGATGATG TTCCCCATTT 8520 

ACTTTATCTA TATAGCTAAT TGCTTTGAAA TTACCGTATC TAGTAGGCAT ATTGACAGTT 8580 

GAAACTCGTT CTACCAGCTG ATCATATACT TTTCTATATT CTTGTAATTC TTTGATGGTA 8640 

ATTAGTGGAA TGTTGTGTTT TTTCGAGAAC TGAATTAAAT CATCTGTTCT CATCATTTTG 8700

CCATCATGAT TCATTATTTC ACAACATAGG CCACACTCTT TTAGTCCAGC TAATTTTAAT 8760 

AAATCAACAG TTGCTTCTGT GTGTCCATTT CTTTCTAGGA CACCACCTTT TTTTGCAATT 8820 

AAAGGAAACA TGTGTCCTGG CCTGCGAAAA TCAGAGGGTG TTATATCTTC AGCTACACAC 8880 

ATACGTGCGG TCAGTCCTCT TTCCTCGGCA GAAATACCTG TGGTCCTTTC TTTATAATCA 8940

ATTGAAACTG TAAAAGCACT CTTATGATTA TCTGTATTGT TTTCAACCAT AGGTGAAAGC 9000

ATTAATTGAT TACCTAAACT TTCGCTCATA GGCATACAAA TTAATCCTTT GGCATAAGTA 9060

GCCATAAAAT TAACATTTTC TGTTGTAGCT GCTTGTGCAG AACAAATTAA GTCTCCTTCA 9120

TTTTCTCTAT CCTTGTCGTC TATAACAAGA ACAAGTCGTC CCTTCTGCAA PGCTTCTAAT 9180

GCTTCTTGTA TTTTTCGATA TTCCATTGAC TGATTATCCT TTCTGCTAAA ATCCATTTTG 9240

ATATAATAGT TCCTTAGATA TTTCTGATTT TGGAGAGTTA TCCATCACTT TTTGCAGATA 9300

TTTACCTAAG ATATCATTTT CAAGATTTAC TGTACTCCCG ACTTGTTTAC TCTTAAGAAT 9360

GGTTTGTTCC AAGGTATGAG GGATAACAGA TACTGAAAAG TTTACTTTGG AGACTTTAGC 9420

GACAGTCAGA CTAATGCCGT CAATTGTAAT ACATCCTTTT TCAACTATTA AATCTAAAAT 9480

TTCTTTTTGT GTGTTGATTT GATACCATAC AGCATTATCA TCTTTTTTTA TTGACGAGAT 9540

TTTTCCTGTA CCATCAATGT GTCCTGTAAC GACGTGACCC CCAAGTCCAC CCTTGACACA 9600

TAAGGCTCTT TCTAGATTCA CCTCACTTCC ATGTTTTAAT AGAGTAAGAG CTGTTCGACT 9660

CCATGTTTCA TTCATTACAT CAACTGTAAA GGATTGATGA TTGAAATGAG TAACTGTAAG 9720

ACAGATACCA TTTACTGCTA TACTATCGCC TAAATGGATA TCCGTTAATA TTTTTGAGGC 9780

TTTAATTGAT AGTTTACAAT TACGAGAGTC TTTCTGTATT CTTTCAACTT TTCCGATTTC 9840

TTCAATTATT CCTGTGAACA TGGATAAATC ACTTCACTTT CTATGAGATA GTCATTTCCT 9900

ATTTGAGAAA ATGCATAAGG TTTCAATCTA ATACCGTCAT TTGGCAAAGA AATACCTTCA 9960

CCTCCGACAG CAAACTTGGC ACTACCTCCA AAAACTTTTG GTGCAATATA TATTTTCAGC 10020

TCATCAACAA TTTGTTGTTC CAAAGCACTC CAATTCATTA GACTGCCCCC TTCTAGAACT 10080

AGGCTATCAA TCTGCATGTT TCCTAGATGT TGCATTAAAC TCGATAAGTC TATATGATTG 10140

CCTTTTTTCT TTATGGAAAG TATTTCACAG CCATGATTTT GATATAGCTT CATTTTATTT 10200

TTGTCTTCAG AGGAAGTGGC AATGTAAGTT TTAATATCAT TTGCTGTTTT TACGATTTTA 10260

GAGGTAAGAG GAGTTCGTAA ATGTGTATCG CATATGATAC GGATAGCATT TTTCCCTTCC 10320

TCCAATCTAC ATGTCAGCAA AGGATCGTCT TGAATAACAG TATTGACTCC CACCATAATT 10380

GCACTAACAT GGTGTCGTAA CTGATGCACA TGCTTTCTTG CTTCTTCTTC AGTAATCCAT 10440

TTGGATTGAT TTGTTTTAGT GGCTATTTTT CCATCCATTG ACATTGCATA TTTCATAAAA 10500

ACATAGGGTA CATGCTGGGT AATATACTTT CTAAAACTTT TTATTAAGTT AAGACACTCA 10560

TTTTCTAAAA TTCCAACAGT AACTTGAAGA TTATTTTCCT CAAGTATCTT TACTCCTTTT 10620

CCAGATACAA TAGGATTACA GTCTAGGCTT CCAATGACTA CTCTTGTAAT ACCACTATCG 10680

ATTATAGCAT CTATACAGGG AGGTGTTTTC CCGAAGTGAC AACAGGGTTC AAGTGTTACA 10740

TAAAGCGTCG CTCCGACAGG GGATTCTCTA CAGTTTTTAA GAGCATTTCT CTCAGCATGT 10800

GGGCCACCAA AAAACTCATG ATAACCTTGT CCGATAATGT GATTATCTTT TACAATAACT 10860

GCGCCGACCA TAGGATTGGG ATTGACGTAA CCAGCCCCTT TTTGTGCCAG TTTTATTGCT 10920

AATTTCATAT ATTTTGAATC GCTCATCTCG CTACCTCCAA AAAAATATAC CTTGAATAGG 10980

GGACTACTCA AGGCATACAA AAGAAAACTT ATGCGATTAA CAAAAATGCT CTGAAATGAC 11040

AAGTAATCAT TTCAGAGCAC GCAAAAAGCA CAAATATACT TTTATCTTCT TTCATCCAGA 11100

CTATACTGTC GGCTTTGGAA TTTCACCAAA TCATGCCTTT CGGCTCGTGG GCTATACCAC 11160

CGGTAGGGAA TTTCACCCTG CCCTGAAGAT AGTTATTCAA TTACAGATGA TTATAGTACT 11220

TAATTTTGAA TATGTCAACA GATAAATACC GATTGTTTTT GATATACTGT ATTTGTGATA 11280

ATCGATTCTC GCTCCTCGGA TAAAGAAAAT ATGATATACT AGATAAACGA AATAAGAGAG 11340

AAGGAATACT ATGTACGCAT ATTTAAAAGG AATCATTACC AAAATTACTG CCAAATACAT 11400

TGTTCTTGAA ACCAATGGTA TTGGTTATAT CCTCCATGTG GCCAATCCTT ATGCCTATTC 11460

AGCTCAGCTT AATCAGGAGG CTCAGATTTA TGTGCATCAG GTTGTGCCTG AGGACGCCCA 11520

TTTGCTTTAT GGATTTCGCT CAGAGGATGA GAAAAACCTC TTTCTTAGTC TGATTTCGGT 11580

CTCTGGGATT GGTCCTGTAT CAGCTCTTGC TATTATCGCT GCTGATGACA ATGCTGGCTT 11640

GGTTCAAGCC ATTCAAACCA AGAACATCAC CTACTTGACC AAGTTCCCTA AAATTGGCAA 11700

GAAAACAGCC CAGCAGATGG TGCTGGACTT GGAACGCAAG GTAGTACTTG CAGGAGATGA 11760

CCTTCCTGCC AAGGTCGCAG TGCAAGCAAG TGCTGAAAAC CAAGAATTCG AAGAAGCTAT 11820

GGAAGCCATG TTGGCTCTGG GCTACAAGGC AACAGAGCTC AAGAAAATCA AGAAATTCTT 11880

TGAAGGAACG ACAGATACAG CTGAGAACTA TATCAAGTCG GCCCTTAAAA TGTTGGTCAA 11940

ATAGGAGCAG AGAATGACAA AACGTTGTTC GTGGGTCAAG ATGACCAACC CGCTCTACAT 12000


CGCCT A Tf* AT 
uu\.\. 1 1\ l L n 1 




GGGGCCAGCC 


CCTCCATCAT 


GACCAAGTAT 


TGTTTGAGTT 


12060 


GTTGTGTATG GAAACCTATC AGGCAGGCCT GTCTTGGGAA ACGGTACTCA ACAAACGCCA 12120

AGCTTTCCGA GAAGTCTTTC ATAGCTATCA AATTCACTCA GTCGCAGAGA TGACTGACAC 12180

TGAATTGGAA GCCATGCTGG AGAATCCAGC TATCATTCGA AATAGAGCCA AGCTTTTTGC 12240

TACACCCGCT AACGCCCAAG CCTTTCTACA GTTACACGCA GAGTACGGCT CTTTTGATGC 12300

CTATCTTTGG TCTTTTCTTG AGGGGAAAAC TGTCGTTAAC GATGTTCCTG ATTATCGCCA 12360

AGCGCCAGCT AAAACACCCT TATCTGAGAA ATTAGCCAAA GATCTCAAAA AACGAGGCTT 12420

CAAGTTCACA GGCCCAGTCG CCGTATTGTC TTTTCTACAG GCTGCAGGGC TAGTTGATGA 12480

CCACGAGAAT GATTGTGAGT GGAAAGGTCT TAAATGATGT CTAACAAAAA TAAGGAAATT 12540

CTGATTTTTG CGATTCTCTA TACAGTCCTC TTTATGTTTG ATGGCGTTAA ATTGCTGGCT 12600

TCTTTAATGC CATCTCCCAT TGCAAATTAT CTTGTTTATG TAGTTTTAGC TCTATATGGC 12660

TCCTTCTTGT TCAAGGATAG ATTGATCCAA CAATGGAAGG AGATTAGAAA GACTAAAAGA 12720

AAATTCTTCT TTGGACTCTT AACAGGATGG CTCTTTCTCA TTCTGATGAC TGTTGTCTTT 12780

GAATTTGTAT CAGAGATGTT GAAGCAGTTT GTGGGACTAG ATGGACAAGG TCTAAATCAG 12840

TCTAATATTC AAAGTACCTT TCAAGAACAA CCACTACTGA TAGCTCTTTT TGCTTGTGTC 12900

ATTGGACCTC TGGTAGAAGA ATTATTTTTC CGTCAGGTCT TATTGCATTA CTTCCAGGAA 12960

CGGTTGTCAG GTTTACTAAG CATTATTCTG GTAGGACTTC TTTrrGCTCT GACTCATATG 13020

CACAGTTTGG CTCTATCAGA GTGGATTGCT GCAGTTGGTT ACTTAGGTGG AGGCCTTGCC 13080

TTTTCTATTA TTTATGTGAA AGAAAAAGAG AATATCTACT ATCCCCTACT TGTTCACATG 13140

TTAAGCAACA GCCTCTCCTT AATCATTTTA GCTATCAGTA TAGTAAAATG AAATGAGAAC 13200

AGGACAAATC GATTTCTAAC AATGTTTTAG AAGTAGAGGT GTACTATTCT AGTTTCAATA 13260

TACTGTAATA TGTGATGAAA ATGCCAGTAA TGATACCGAG AAAAAAGCTG AGAAACTTTT 13320

CCCAGCTTTA TTTGTTATAG TCAAAGAGAA TGACTTGTTC CTGTGCATCT ACATGAGCAT 13380

GGACCCCAAA GGGTACAATT GCTCTTGGAG TTGCGTGGCC GACATTCAGA TTATAGACAA 13440

TCGGGATATT GCTGTCAATG ATATCCAATA GTGCCTCTTT ATACTCGTCA TGGAAAGTTT 13500

CATCCATAGG TTTTCCGACC AAGAGTCCAT TCATGACCCC GAATATGCCA GTGTCCTTTA 13560

AAGTTAGCAA CATCTTTTTG AAGTCTTCTG GCTTAGGCTT TTCTTCGCTT GTTTCGAGCA 13620


AGAGGATTTT 


CCCTTCCCAC 


TCTCAC.WGT 


CAGGSAAAAG 


i . . J . . V . . ^ . 


7CGCACAGTT 


13680


CCGTGCTATC TGCGTATCGA GAGTTGTCAA AGATATCGTA GAGGGATTCG AGGCAACCAC 13740

CGAGGATTTT CCCCTCGAAC TGGGCACTTC CTTGCAACAA GTCAAAACCT GTATTTGTAT 13800

GACTGACACG AGCTGTTCCC AGGGCCGTGG GACTAAAATC AGTTCGTTCC TCATACCAAA 13860

CGTCACTAGG GCGGATTTCT GAAATTCTTC CCGTCTCAAT CAATTCTTTA AAGTAGTGAA 13920

GGCTATAGGC TAGCATTTCT TTGTCTAATT CACAAATGTC TGCTAAAAAG GATTGACCAT 13980

AAAAAGTCTT GATTCCTAAT TTATGCAACA TGAGGTCGTT CATGGTTGTA TCCGAGAAGC 14040

CAAGAAAAAT TTTTTGCTTG ATAACCTTTT GGAGTTGGTC ATTTTCAAAA AGATAAGGTA 14100

GCAAGCGATA GCTATCGTCT CCACCGATGG CACATAGGAT CATGTCGATG CTATCATCAG 14160

AAAAGGCATG AATCAAATCC TCTGCACGAG CTTCAGGATG GTCCTTGATA AAGTCTAATC 14220

CTTTTAACGA ATGGGGCAAA AAGATGGGAT TGGTCCCAGA TCCTTGAGAC GTT 14273
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9828 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GTGAAGTGCG GCAAAAGGTG CAAGTGATGA GCTCAGGTTC TTTAGCTCTT GACATTGCCC 60 

TTGGCTCAGG TGGTTATCCT AAGGGACGTA TCATCGAAAT CTATGGCCCA GAGTCATCTG 120 

GTAAGACAAC GGTTGCCCTT CATGCAGTTG CACAAGCGCA AAAAGAAGGT GGGATTGCTG 180 

CCTTTATCGA TGCGGAACAT GCCCTTGATC CAGCTTATGC TGCGGCCCTT GGTGTCAATA 240 

TTGACGAATT GCTCTTGTCT CAACCACACT CAGGAGAGCA AGGTCTTGAG ATTGCGGGAA 300 

AATTGATTGA CTCAGGTGCA GTTGATCTTG TCGTAGTCGA CTCAGTTGCT GCCCTTGTTC 360 

CTCGTGCGGA AATTGATGGA GATATCGGAG ATAGCCATGT TGGTTTGCAG GCTCGTATGA 420

TGAGCCAGGC CATGCGTAAA CTTGGCGCCT CTATCAATAA AACCAAAACA ATTGCCATTT 480

TTATCAACCA ATTGCGTGAA AAAGTTGGAG TGATGTTTGG AAATCCAGAA ACAACACCGG 540 

GCGGACGTGC TTTGAAATTC TATGCTTCAG TCCGCTTGGA TGTTCGTGGT AATACACAAA 600 

TTAAGGGAAC TGGTGACCAA AAAGAAACCA ATGTCGGTAA AGAAACTAAG ATTAAGGTTG 660 

TAAAAAATAA GGTAGCTCCA CCCTTTAAGG AACCCGTAGT TGAAATTATG TACGGAGAAG 720 

CAATTTCTAA GACTGGTGAG CTTTTGAAGA TTGCAAGCCA TTTGGATATT ATCAAAAAAG 780 

CAGGGGCTTG GTATTCTTAC AAAGATGAAA AAATTGGGCA ACGT7CTGAG AATGCTAAGA 840

AATACTTGGC AGAGCACCCA GAAATCTTTG ATGAAATTGA TAAGCAAGTC CGTTCTAAAT 900 

TTGGCTTGAT TGATGGAGAA GAAGTTTCAG AACAAGATAC TGAAAACAAA AAAGATGAGC 960 

CAAAGAAAGA AGAAGCAGTG AATGAAGAAG TTCCGCTTGA CTTAGGCGAT GAACTTGAAA 1020 

TCGAAATTGA AGAATAAGCT GTTAAAGCAG TGCAGAAATC CGCTACTTTT TCGATTTTTG 1080 

ATTCAAGTTT TTAGATTATA TATAGTAGCT TGAAATAAGA TATGAACAAC TCTATTAGGA 1140 

AAGTCAAATT AATTTCTAGA AATGTTTTAG CAGCTACAGC GTACTATTCC AAACTCAACC 1200 

AACTATAATA GATCGAAACT AGAATAGTAC ATATCTACTT CTAAAACATT GTTAAAAATC 1260 

GATTTGACTT TCCTTATTTC ATTCCGCTAT ATATAGTTTG CTGTTTCTTG TCCCTCCTCT 1320 

GGAAAGCTGA TATAATAGCT TTATGAATAA AAAACGAACA GTGGACCTGA TACATGGTCC 1380

GATTCTTCCC TCGCTCTTAA GCTTCACCTT TCCAATTTTG CTATCAAATA TTTTTCAACA 1440

GCTCTATAAC ACTGCTGATG TCTTGATTGT TGGACGATTT CTTGCTCAAG AATCCTTGGC 1500 

TGCAGTAGGA GCGACGACAG CGATTTTTGA CCTGATTGTA GGTTTTACAC TTGGTGTTGG 1560 

CAATGGCATG GGGATTGTCA TTGCTCGTTA TTATGGGGCT CGCAATTTCA CTAAAATCAA 1620 

GGAAGCAGTA GCAGCCACCT GGATTTTAGG TGCTCTTTTG AGCATTCTAG TTATGTTGCT 1680 

GGGCTTTCTT GGCTTGTATC CTCTCTTGCA ATACTTAGAT ACTCCTGCAG AAATTCTTCC 1740 

TCAATCTTAT CAATATATTT CTATGATTGT GACCTGTGTA GGTGTCAGCT TTGCTTATAA 1800 

TCTTTTTGCA GCCTTGTTCC GGTCTATTGG TGACAGTCTA GCAGCCCTGG GATTTCTGAT 1860

TTTCTCTGCC TTGGTTAATG TGGTTCTGGA TCTCTATTTT ATTACGCAAT TCCATCTGGG 1920 

AGTTCAATCC GCAGGACTTG CTACCATTAT TTCGCAAGGT TTATCAGCGG TTCTCTGCTT 1980 

TTATTATATT CGTAAAAGTG TGCCAGAACT CTTGCCACAG TTTAAACATT TCAAATGGGA 2040 

CAAAAGCTTG TACGCGGATC TCTTGGAGCA AGCTTTGGCT ATGGGCTTGA TGAGTTCAAT 2100 

TGTATCTATC GGCAGTGTGA TTTTACAGTT TTCTGTTAAT ACATTTGGTG CAGTGATTAT 2160 

TAGTGCCCAG ACGGCAGCTC GACGCATTAT GACCTTTGCC CTTCTTCCTA TGACCCCTAT 2220 

TTCTGCATCA ATGACGACCT TTGCTTCTCA GAATCTAGGA GCTAAGCGAC CTGACCGTAT 2280 

TGTTCAAGGT CTTCGAATCG GCAGTCGTTT AAGTATATCC TGGGCAGTTT TTGTTTGTAT 2340 

TTTCCTCTTT TTTGCCACTC CAGCTTTGGT TTCCTTCTTG GCTAGTTCGA CAGATGGTTA 2400 

CTTGATAGAA AATGGAAGTC TCTATCTGCA AATCAGTTCA ACCTTTTATC CCATTTTGAG 2460

CCTCTTGTTG ATTTATCGCA ATTGCTTGCA GGGCTTGGGG CAAAAGATCC TTCCTCTAGT 2520 

TTCTAGCTTT ATTOAACTAA TCGGAAAAAT CCTTTTTGTG GTTTT jATTA TTCCTTCGGC 2580 

AGGATATAAG CGTGTTATCC TTTGTGAACC TCTTATCTGG GTTGCCATGA CAGTTCAACT 2640 

GTACTTCTCA TTATTCCGTC ATCCCTTGAT AAAAGAAGGC AAGGCAATCT TGGCAACCAA 2700 

AGTGCAATCC TAGTTGGATT TACTGAATAA AATCCATTTC CTCTAGTGAA AATCGAAAAA 2760 

ACTTGTGTTC TCTTCTTTAG TTTGGTGTTG AAAATAGTTT AACAGACTTT TCACTTCTTT 2820

TATATGATAT AATAAAGTAT AGTATTTATG AAAAGGACAT ATAGAGACTG TAAAAATATA 2880 

CTTTTGAAAA TCTTTTTAGT CTGGGGTGTT ATTGTAGATA GAATGCAGAC CTTGTCAGTC 2940 

CTATTTACAG TGTCAAAATA GTGCGTTTTG AAGTTCTATC TACAAGCCTA ATCGTGACTA 3000 

AGATTGTCTT CTTTGTAAGG TAGAAATAAA GGAGTTTCTG GTTCTGGATT GTAAAAAATG 3060 

AGTTGTTTTA ATTGATAAGG AGTAGAATAT GGAAATTAAT GTGAGTAAAT TAAGAACACA 3120

TTTGCCTCAA GTCGCCGTGC AACCATATAG GCAAGTACAC GCACACTCAA CTGGGAATCC 3180

GCATTCAACC GTACAGAATG AAGCGGATTA TCACTGGCGG AAAGACCCAG AATTAGGTTT 3240

TTTCTCGCAC ATTGTTGGGA ACGGTTGCAT CATGCAGGTA GGACCTGTTG ATAATGGTGC 3300

CTGGGACGTT GGGGGCGGTT GGAATGCTGA GACCTATGCA GCGGTTGAAC TGATTGAAAG 3360

CCATTCAACC AAAGAAGAGT TCATGACGGA CTACCGCCTT TATATCGAAC TCTTACGCAA 3420

TCTAGCAGAT GAAGCAGGTT TGCCGAAAAC GCTTGATACA GGGAGTTTAG CTGGAATTAA 3480

AACGCACGAG TATTGCACGA ATAACCAACC AAACAACCAC TCAGACCACG TTGACCCTTA 3540

TCCATATCTT GCTAAATGGG GCATTAGCCG TGAGCAO-ITT AAGCATGATA TTGAGAACGG 3600

CTTGACGATT GAAACAGGCT GGCAGAAGAA TGACACTGGC TACTGGTACG TACATTCAGA 3660

CGGCTCTTAT CCAAAAGACA AGTTTGAGAA AATCAATGGC ACTTGGTACT ACTTTGACAG 3720

TTCAGGCTAT ATGCTTGCAG ACCGCTGGAG GAAGCACACA GACGGCAACT GGTACTGGTT 3780

CGACAACTCA GGCGAAATGG CTACAGGCTG GAAGAAAATC GCTGATAAGT GGTACTATTT 3840

CAACGAAGAA GGTGCCATGA AGACAGGCTG GGTCAAGTAC AAGGACACTT GGTACTACTT 3900

AGACGCTAAA GAAGGCGCCA TGGTATCAAA TCCCTTTATC CAGTCAGCGG ACGGAACAGG 3960

CTGGTACTAC CTCAAACCAG ACGGAACACT GCCAGACAAG CCAGAATTCA CAGTAGAGCC 4020

AGATGGCTTG ATTACAGTAA AATAATAATG GAATGTCTTT CAAATCAGAA CAGCGCATAT 4080

TATTAGGTCT TGAAAAAGCT TAATAGTATG CGTTTTCTTG TGGAGATATT TCCTTCAATT 4140

TTGCTACTAT ATTAAACAAA AATCAAAAAG CAAACTAGAA AGTTATGCTC AAATAAAATC 4200

TAAATTTGAC AATGTAAACC GAGTCGGATA GCTTTAAGTA CTCTTTTGAG GTTGAAGATA 4260

CGATTTTTGA TAGGAACTCA TCAATTTTAG ATTTTTAAGC AGCATCAATA AATTGCTTCC 4320

TTGTTTTGTC ATAATTTTTT TATTTAAAAA ATTATGACma GAGTCTGCTA TTCTTTTTAT 4380

GAGAGGTGTA TGAATATGAT AAATGTATGT GATAAATGTA TGTGATGTTG GAAAAAGAAT 4440

AAAAGAACTT AGAATATCTT CAAATCTTAC TCAAGATAAG ATTGCTGACT Al I I brLT. 1 4500

GAATCAAAGC ATGATTGCCA AAATGGAAAA AGGTGAAAGG AATATCACGA ATGGATTTAA 4560

CTAATAAAGC TTCAAATCTT AGAAAAAAGT TGGGAGCTGA TGGTGAATCG CCGATAGATA 4620

TTTTTAAATT GGTACAAAAG ATAGAAAATT TGACGCTGGT ATTTTATGGA CTCGGAAAGA 4680

ATATTAGCGG AGTCTGTTAT AAAGGAACTC ACTTCAGTCT CATTGCAGTC AATTCAGACA 4740

TGCCATTAGG AAGGTAAAGA TTTTCTTTAG CACATGGACT GTATCATCTT TATTATGATG 4800

AGGTGAAGAA GAGTTCACTC AGTCTTATCT TGATTGGTGA AGGAGATGAA ACTGAAAGAA 4860


AAGCGGATCA GTTTGCTTCT TATTTTTTAA TTTTCCCATC TTCACTGTAT AGGATGGTTG 4920

AGGAAATCAG AGAAAATGCC AATAGAACTC ATCTTGAAGT AGAACATATT ATAAAATTGG 4980

GTCACTTTTA TGGTATCAGT CATAAAGCTA TGTTATATAG ATTGAGGAAT GATGGATACC 5040

TTCATGCAGA AGAAATTAAA AATA7GGATA TTACTGTTAT AGAGACAGCT TCAAGATTAG 5100

GCTATGATAC AAGTTTATAT CGTCCTTTGT CAGAAAGTAA AAAAGAAATG GCATTAGGAT 5160

AATATATTAA TTCAACTGAA CAACTTTTAG AAAATAACAG AATTTCGCAA GGGAAGTATG 5220

AGGAACTGTT ACTAGATGCT TTCAGATATG ATATTGTATA TGGGCTAGAT CAAGAGGGGG 5280

GAGTTGTCGT TTGACTAGTC GTGTATTTAT TGATGCAGAT TGTATTTCAG TATTTTTATG 5340

GGTTGGCACT GAACATCTTT TAGAAAAGCT CTATTTGGGT AAAATTGTTA TTCCACAAGA 5400

GGTGTATGAT GAAATCAATA TACCTACAAT TCCCCATTTA AAATCTAGGA TAGATCAGTT 5460

GGTAGCTAAG GGTTCAGCTG AGATTGTGAG CATAGACATT GGAACTGAAG AATACGCATT 5520

ATATAGAGAT TTAACAAGAA ATCATGATAG TAACAAGATT ATTGGTAAGG GAGAAGGGGC 5580

ATCTATTTCC TTAGCGAAAA AGCATAATGG GATATTAGGA AGTAATAACC TAAGAGATGT 5640

TAAATCATAT GTAGAAGAAT TTTCTTTAGA ATATATGACA ACAGGAGATA TACTGATTGA 5700

AGCGTTTAAA GCGTAATTTA TTACTGAATA AGAGGGCAAT CATATCTGGA ATAATATGCT 5760

TAAAAAGAGA AGGAAAATTG GTGCAAATTC ATTTTCAGAC TATCTTCGTG GAAGTATTCA 5820

TCAAAATAGA CAAAAATAAA TTTGGATAAA TCGAACTCAC TATTCAGGAG GCATATGAGC 5880

AATTCGAAAA AGAAAAGTGT CAAATTGAGC CTATAGGAGT AGAAGTGAAA TAGTAAGTCC 5940

TGCATAGTGG ATGAGAGAAA AGTTCTCCTT GAAGTTTTCC TGAACTATCA GTCGCATGTC 6000

AAACGATATG TAGGGTAATG TGAGAGGGGA TAGCGAGTAG TTTTTGGTTA TTTTATCAAA 6060

AAAC7TA7AT TTTA7TA7AC CCAATCA7AA AA7A7AA7AA AAATOATASA ATAACGAAAA 6120

AACATGAATG TCAAAAAGAT AATGTCAATT TTTCAATCCT TTTATGTTGA TGTCACTATT 6180

GAGGAACTGA CTTTGACTTT ACCAATCAGT TTTGTAAAAA GCTTTGAGTA TACTCAAATG 6240

ACTTTTCATA AGGAATCATT TTTATTGATT AAAGAAAAGA GAAGGGGGAG TTTGAGTTCA 6300

TTTGTTACTC AGGCTCGCAC TATGGGTGAA AAAGCCAATA TGGATGTTGT TTTGGTGTTT 6360

TCGAAGTTAT CAGACAGTGA AAAAAAGCAA TTACTTCAAG CTAGAGTTCC GTTTGTAGAC 6420

TTTAAGGGAA ACCTCTTCTT CCCTCCATTG GGACTAGTAC TCAATGCGAA TGATACTGAA 6480

GTCCCTAAGG AATTAACACC TAGCGAACAA TTAACGTGGA TTGCCTTTTT ATTGACAAAA 6540

GGTCAAAAAG TAGTAGATGT TGATTTGCTT TCACAAGTCA CTGGACTTCC AAACTCAACA 6600


ATTTATAGGT GTTTGAGGAC TTTTAAAGCT TTATATTGGT TAAACAAGCA AAATAAGCTT 6660

TACACATATA CGGTGTCAAA GAAAGAATTA TTCTTAAAAT CCGTGTCATG TTTATTTAAT 6720

CCCATCAAAA AACGGATTTT ATTGCCAGAT GGCGATATAA AGCAGATAAA ATCTGTTTCT 6780

AACCTTCTAT ATGGTGGTGC TTATGCTTTG TCGCATTCAA crrrriTAGc TGAAACGGAT 6840

GAAAATATTA GCTATGTCAT ATGGCAGAGA AAATTCAATC AGTTATCCTT GCCACTTTCT 6900

CAGCATGTTT TAAAATGAAA GATGCTAGAG ATATGGAAAT ATCGTCCTTT TGTATCTGAG 6960

TTTTGGAATG ATTTTAAAAA TAATCATGAT AAACAATTTG TAGATCCGAT TTCTCTTTAT 7020

TTGACCTTAA AAGATGATGA TGACCCACGT ATAGAGGAAG AGAGTGAAGC ACTAGAAAAT 7080

ATGATATTAC AGTATCTGGG AGAAGATGAT GCCAGCTAAT ACGAAACTTA TTTTTCAAGA 7140

a jiTdTTTficc; GATTTTCAGA ACTATTATGT TCTGATTGGG GGAACTGCTA CCTCTATCGT 7200

Al i V/VJrt 1 1 W PAAGGATTTA AAAGTCGCAC AACAAAAGAT TATGATATGG TCATCATTGA 7260

TYtAAfVFAAAA AATAAGGAAT TTTATACTAC CTTCAATCAT TTTTTAGAAT TGGGAGAGTA 7320

[illegible] O Af? A A AC&TG AGAAAGCGCA GCTTTTTCGA TTTACAACAA CTAATCCTGA 7380

[illegible] ATCATTGAAC TATTTAGTAT CTTACCAGAA TATCCATTAA AGAAGGACGG 7440

i LununAn 1 i [illegible] TTGACCAAGA TGCTAGTTTA TCAGCCTTAT TATTGGATGA 7500

[illegible] a atatattgc; TGCATGAAAA AGAAACCATT CAGGGGTATT CCGTATTGAG 7560

[illegible] [illegible] CGAAAATCTC TTCAAACCAC GTCAGCTTCC ATCTACAACC 7620

[illegible] GTTTTGAGCA GCCTGCAGCT AGCTTCCTAG TTTGCTCTTT GATTTTCATT 7680

unu Ini t t\t\ I TATTTTTAAG GCTAAAGCTT GGCTGGATAT GAGGGAGCGC TCTCCCACAC 7740

[illegible] TTTAAGTAAG TCCATTAAAA AGCATTTGAA TGACCTTACC CGTTTGACAG 7800

[illegible] AGGAGATGAA AAGTTATCGG CTATAACATC AAGTAGTGCG GTAAAAGCAG 7860

[illegible] CTTTCTGATA GAATTAGAGC CTGTGAAGTC AACTATTCTT CAAAATAATG 7920

ftfATTTfRTT GGATCAAAAT GAAATTTTTG AAATTCTGAA AAATTTTCTC GATGGTTAAA 7980

ATAATTGTAG CGAGATGGCT ATATTGAATT CGTCTATATC TGGAAACTAG AAAAAAuTTC 8040

AATTTCAGGA GAAAATGAAG TCAATCTTCC CACAATCAAA CGTATAGTAT CAAGGTTTTT 8100

CAAGACCTGA TATTATGCGT TTTTTGCTTT TCAAAACTTT TTGCCCAGTC TTCGTTTTTA 8160

TCCTCTAGTC ACTTGATTTG TTTCAGGTGG TTTTPTAGTA TAGTAGAATG AAACGAGAAC 8220

AGGACAAATT GATCAGGACA GTCAAATCGA TTTCTAACAA TGTTTTAGAA GCAGAAGTGT 8280

ACTATTCTAG TTTCAATCTA CTATAGTTAA ATCTGCGGTC AAGTCTACTG GTGAATCTAT 8340

GATTGTAATA CTCTTCCAAA ATCTCATCAA CCACGTCAGT CTTGCCTTGC AGTCTGTATC 8400


TTACTGACCA AGCTAGTGAT GGATTTAGAA TAGGTGATTT GGAGCGTCCT ATTAGCTAGG 8460

AAATGCTCCT CATAGTCCTT TGCTGAGGCT AGGGTCTTTC AACATTCAAC ACTCAACTGG 8520

TTGATCTAGT TGATAGGAAG GGAGTTACTA TAAAATACTC AGGCTTCCAT CATATTTTTT 8580

GAAACGATTG TGTAATCAAA ATGTACCAAT ATTGTAGTAT TGGTACAGAA GATGTTGTGA 8640

ATGGATAAAT ATATCATAAC TGCTATCTCA AAAAGATTTC ATATGTCTGT GCATATATAA 8700

TAGACTTCCT GCAAAACTAG AATCCTAGTT CATGATTGAT AATACCAGCA ATCAAATTCA 8760

TTCGTAATCC AAAGCGTTTA CGATGATTTC GATAGGTTGT TGAAAACATT TTAAACGTTT 8820

CTACTTTGGC AAAGATGTTC TCAACCTTGC TTCTCTCCTT AGATAGCGCA TGGTTATAGG 8880

CHTATCTTC AGCTGTTAGC GGCTTGAGTT TGCTGGATTT ACGTCGACTT TGTGCTTGAG 8940

GACATATCTT CATGAGCCCT TGATAACCAC TGTCAGCCAA GATTTTACCA GCTTGTCCGA 9000

TATTTCTGCA ACTCATTTTG AACAACTTCA TATCATGACA ATAGTTCACA CTGATATCCA 9060

AAGAAACAAT TCTCCCTTGA CTTGTGACAA TCGCTTGAGC CTTCATAGCG TGAAATTTCT 9120

TTTTACCAGA ATCATTCGCT AATTCTTTTT TTAGGGCGAT TGATTTTTAC TTCCCTCGCA 9180

TCAATCATTA CCGTGTCCTC AGAACTAAGA GGAGTTCTTG AAATCGTAAC ACCACTTTGA 9240

ACAAGACTTA CTTCAACCCA TTGGCTCCGA CGGATTAAGT TGCTTTCGTG AATACCAAAA 9300

TCAGCCGCAA TTTCTTCATA AGTGCGGTAT TCTAGGCTTA ATTTAGGTTT TCGTCCACCT 9360

TTTGCGTGTT TAAGTTGATA AGCTGTTTTT AATACAGCTA ACATCTCTTT AAAAGTCGTG 9420

CCCTGAACAC CAACAAGACG CTTAAATCGT GTATCAGTTA ATTGTTTACT TGCTTCATAA 9480

TTTCGCAGGG AGTCTATTGA CTCTTTGGTA GGTCTCAATG TTTTTTTCAT CTATCCCGAG 9540

AATTATTTTC CCGCCATTTG TATTTGCAAA TGCTGAGTAG GTTTCCCAGA AAGACTCTGG 9600

AAGATTGTTT TTAGCTTTTT TCTATTCTAA ATCAACCCCT TCAAATTTTA ACTCCATATT 9660

TTTCCTTTAC ATCTGTTTTT TGTGGTTCTG GTATTTGTTC AAGTTGAGTG ATAATATAGC 9720

GAATTGAATT TCGAGAGTTT TTACTCAGTT AATTTCTTTT TTAACCCACT TTAATTGCTT 9780

TTTTAACACG GGTTAAAAAA GAAATTAAAG TGGGTTAATT TTTCTTGA 9828


(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3369 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CCGCGAAAGA TATTTTTGAA CAAGAGTTTG GACGTGAGGT CeST&.'TPAT AATAAAGTAG 60

AAGTTGACGA GTTTTTAGAC GATGTCATCA AGGACTATCA AACCTATGCT GCCTTGGTCA 120

AGTCACTTCG TCAGGAAATT GCGGATTTGA AGGAAGAATT AACTCGTAAA CCGAAACCTT 180

CACCAGTTCA AGCAGAACCC CTTGAAGCGG CAATTACAAG TTCTATGACG AATTTTGATA 240

TTTTGAAACG CCTGAATAGA TTGGAAAAAG AAGTTTTTGG TAAACAAATT TTAGATAACT 300

CAGATTTTTA AGTAGTTATT TGAGATGTGC AATTTTTGGA TAATCGCGTG AGGAGAATTG 360

TTTCTCATGA GGAAAGTCCA TGCTAGCACA GGCTGTGATG CCTGTAGTGT TTGTGCTAGG 420

CGAAACCATA AGCCTAGGGA CGAGAAATCG TTACGGCAGT TGAAATGGCT AAGTCCTTGG 480

ATAGGCCAGA GTAGGCTTGA AAGTGCCACA GTGACGGAGT CTTTCTGGAA ACAGAGAGAG 540

TGGAACGCGG TAAACCCCTC AAGCTAGCAA CCCAAATTTT GGTCGGGGCA TGGAGTACGC 600

GGAAACGAAC GTAGTATTCT GACTGCTATC AGCTAGAGCT GTTAGTGGTA GACAGATGAT 660

TATCGAAGGA AGTGGTCCTA GTCACTTCTG GAACAAAACA TGGCTTATAG AAAATTGCAT 720

ATAGGTTGGG GCTGAGAAAT TTTCTCAACC TCATTTTTTA AAGTGGACAT ATAGAAAGGT 780

CTTGCAAGAC TGTAACATGA AAAAAGAATT TAATTTAATT GCAACTGTGG CAGCAGGGCT 840


TGAGGCTGTC GTTGGTCGTG AAGTCCGAGA GTTGGGCTAC GATTGTCAGG TTGAAAATGG 900

ACGTGTTCGT TTTCAAGGAG ACGTGAGAGC TATTATCGAA ACCAACCTTT GGCTTCGGGC 960

AGCAGATCGT ATCAAAATTA TCGTAGGAAC GTTCCCAGCT AAGACTTTTG AAGAGCTATT 1020

TCAGGGAGTT [illegible] ATTGGGAAAA TTATTTACCA CTTGGAGCTC GGTTCCCCAT 1080

TTCAAAAGCT AAATGTGTTA AGTCCAAACT TCACAATGAG CCCAGTGTTC AGGCTATTTC 1140

TAAGAAAGCT GTTGTCAAGA AATTGCAGAA ACACTATGCT CGCCCAGAAG GGGTTCCTCT 1200

GATGGAGAAT GGCCCAGACT TTAAGATTGA GGTCTCTATT CTCAAAGATG TGGCAACTGT 1260

CATGATTGAT ACGACCGGGT CTAGCCTCTT TAAACGTGGT TATCGTACCG AAAAAGGTGG 1320

CGCTCCTATC AAGGAAAATA TGGCAGCAGC CATTTTACAA CTTTCTAACT GGTATCCAC-A 1380

CAAGCCTTTG ATTGATCCGA CCTGTGGTTC GGGGACTTTC TGTATTGAGG CAGTTATGAT 1440

TGCTAGAAAG ATGGCGCCAG GTCTTCGTCG CTCTTTTGCA TTTGAGGAAT GGAACTGGAT 1500

CAGCGATCGC TTGATTCAAG AAGTGCGCAC AGAAGCGGCT AAAAAAGTAG ACCGTGAGCT 1560

TGAGCTGGAT ATCATGGGCT GTGATATTGA TGCTCGCATG GTGGAAATTG CTAAGGCCAA 1620

TGCTCAGGTA GCTGGTGTTG CAGGAGACAT TACTTTTAAG CAGATGCGCG TGCAGGATTT 1680

ACGTTCCGAT AAAATCAATG GAGTAATCAT TTCCAATCCG CCTTATGGTG AACGTTTGTC 1740

AGATGATGCA GGGGTGACCA AGCTCTATGC TGAGATGGGG CAAGTATTTG CACCX3CTGAA 1800

AACTTGGAGC AAATTTATCC TGACTAGTGA TGAAGCTTTT GAAAGCAAGT ATGGTAGCCA 1860

AGCAGATAAG AAGCGTAAGT TATACAACGG AACCTTGAAA GTGGATCTAT ATCAATATTT 1920 

TGGTCAGCGT GTCAAACGGC AAGAGGTAAA ATAGAAAGGG ATACTCATGA GTAAAAAAAG 1980 

ACGAAATCGT CATAAAAAAG AAGGTCAAGA ACCGCAATTT GATTTTGATG AAGCAAAAGA 2040 

GCTAACAGTT GGTCAAGCTA TTCGTAAAAA TGAAGAAGTG GAATCAGGAG TCTTGCCTGA 2100 

GGATTCCATT TTGGACAAGT ATGTTAAGCA ACACAGAGAT GAAATTGAGG CGGATAACTT 2160 

TGCGACTCGT CAATACAAAA AAGAGGAGTT CGTTGAAACT CAGAGTCTGG ATGATTTAAT 2220 

TCAAGAGATG CGTGAGGCTG TAGAGAAGTC AGAACCTTCT TCGGAGGAAG TTCCATCTTC 2280 

TGAAGACATC TTACTACCCT TGCCTCTGGA CGATGAGGAG CAAGGCTTGG ATCCTCTATT 2340 

GCTAGATGAT GAAAATCCAA CAGAAATGAC TGAAGAAGTG GAAGAGGAGC AAAACCTTTC 2400 

TCGTCTGGAT CAAGAGGACT CAGAAAAGAA AAGTAAAAAA GGCTTTATTT TGACCGTTTT 2460 

GGCGCTTGTA TCAGTAATTA TTTGTGTCAG TGCTTATTAT GTCTACCGTC AAGTGGCTCG 2520 

TTCGACTAAG GAAATTGAAA CTTCTCAATC AACTACAGCC AATCAATCGG ATGTGGATGA 2580 

TTTTAATACA CTTTATCACG CCTTTTACAC AGATAGCAAT AAAACGGCTT TGAAAAATAG 2640 

CCAGTTTGAT AAACTGAGTC AACTCAAGAC TTTACTTCAT AAGCTGGAAG GTAGTCGTGA 2700 

ACATACCCTT GCCAAATCTA AATATGATAG TCTAGCAACC CAAATCAAGG CTATTCAAGA 2760 

TGTCAATGCT CAATTTGAGA AACCAGCTAT TGTGGATGGT GTGTTGGATA CCAATGCCAA 2820 

AGCCAAATCG GATGCTAAAT TTACGGATAT TAAAACTGGA AATACGGAGC TTGATAAAGT 2880 

GCTAGATAAG GCTATCAGTC TTGGTAAGAG CCAGCAAACA AGTACTTCTA GCTCAAGTTC 2940 

AAGTCAAACT AGCAGCTCAA GTTCAAGTCA ACCAACTTCA AATACGACTA GTGAGCCAAA 3000 

ACCAAGTAGT TCAAATGAGA CTAGAAGTAG TCGCAGTGAA GTCAATATGG GTCTCTCGAG 3060 

TGCAGGGGTT GCTGTTCAAA GAAGTGCCAG TCGTGTTGCC TATAATCAGT CTGCTATTGA 3120 

TGATAGTAAT AACTCTGCCT GGGATTTTGC GGATGGTGTC TTGGAACAAA TTCTAGCGAC 3180 

TTCACGTTCA CGTGGCTATA TCACTGGAGA CCAATATATC CTTGAACGTG TCAATATCCT 3240

TAACGGCAAT GGTTATTACA ACCTCTACAA GCCAGATGCA ACCTATCTCT TTACCCTTAA 3300

CTGTAAGACA GGCTACTTTG TCGGAAATGG CGCTGGTCAT GCGGATGACT TAGATTACTA 3360

AGCAGTCGG 3369

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9713 base pairs 




(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

AAGTTTACAA TTTAAATGAA TTAACAATTT TCCCAACTAA AAGCACTCCA GTTACCGCAA 60 

CGTTTGTACT GAATGTACTA AATCGCATTC CATCAACTTC ATCTGTTTCG TCAACTTGAA 120 

CAGATACTAA TTGAAGATTT AATACTTCTG CTGCCATAGC TAGCTCCTCC TATTTAAATT 180 

TTTGGGATTA AGTACTTTAT CCACCCTCAT ATACTCTCTC CACCAGTAAA ATGCAAGCAA 240 

TGATACAAAA TAGATTTAAC TATTTTATAT AGCGAAAACT TACAAATTTT TAAGAAATAA 300 

TTTTTGCATT CTTAAAGATA AAATAGGAAC TTTTAGTAAT AAATATTAAA ATAAATAAAA 360 

TAATAGATAC TATAAAATTT GGAAGTATTA ACCCCAAAAG ATTCATATCA TCTATTAAAA 420 

TATCCTCTAA AGAGTAGTAT ATTAAAGCCA TAATTTTAAT GTTAAGTAAA AATGCAATTA 480 

ATGAAGTAAC AAATGTCAAA AATATAGCCT CACCAACTTT AATCTTAACC ATCTGGTAAT 540 

TAGAAGTTCC TAAAATTTCA AATTGCTGAA TCTCAATCCT TTCTTGATGC GATGACAAAA 600 

ATGCAATTGA AATAATATTT GCAAGTACTA TCAAAATTGG TGCTCCTACA TAGACAATAA 660 

ATGCTACTTT TAGCTCTAAA TCACTGTCAT CTTGAAATTG AGATAGTATA TTCTGAGAAA 720 

TCATTTGAAA ACTAGAAATT AGTAATATAG CTCCTGTAAT TGCAGCACTG ATAGATTTTA 780

TATAAGACTT ACAATATAGT AAATTCCACT TCGAAACAAT GAACATAAAA TTATTTCTAA 840 

ATATAATTAT AGAAAGTAGT TTGATAAAAC ATGACTGTAT AAAAGGAGAT AATTGATAAA 900 

TAATCACAAT ATCTAAGATT ACAATATTGA ATATTATCTG GGCCTTCGCT AAAATTGTGC 960 

TATCTTGGAA AATTTGTTGC AAAGAAAGCA ACCACATAAC ACTAAAACCA GCCAATAGCA 1020 

GTATTCTTTT TACTATTGAA AGAACATGCC TTATTTTAGA ACTCTTCCTA TTTCTAATCT 1080 

TCTTGAACGT ATAAAAGCAA CCACTTAGAA AGGCTAAAAA TGAAATCAAC ACTACTGTAA 1140 

TGATACATCC AACAGCACTC GTTTGAAATT GGATATCAGG TAATATATTT TCCCCGAAAA 1200 

AGTATTGTAA AAAATAATAA TAATTTGACG TAACAAATAT AGAGCATAGA TATGCAATAA 1260 

AACTAATAAT CGAGGAAATG ATAAAAATCT GTCCCCCCAC AAGAAATGAT AGTTGAAGGC 1320 

GACTTGCTCC CAACACCTCC AGAAGTTCGT AATCATCTCT AAAAATTTCA ACCAACATAT 1380 

TTATTATGTT AGAGAGCACA AAGAATAATG TTACTCCTCC GAATACTATC GGAAACATAA 1440 

AAATTGGTTT AGGATCTGGA AGTCCGACAA ATACTTGCGA ATTATTCTCA ACATTAATTA 1500 

CCCCATTAAC AGCCAATCCC ATAACTAAAC TCGAAACAAA AATTACTGGT GAAACGCCTA 1560

ACCATTGTTT CTTATTATGT AAAAATTGAT AGTAAACTAA TCTGAGCATC TCTATTCCTC 1620

CGTAGTTGAT TGTACCTCTA AGATTTTATA CAACTCTTCC CCGCTAGGTC TATGAAGTTC 1680

TTTGAAAATT TTTCCATCTT TCAATATTAA TGCACGATCA GTTTTCGAGG CCAATTCTAT 1740

ATCGTGCGTT ACCATAATTA CACACTTACC CGCCCCTACT AACTCTCTCA ATAATTCAAA 1800

AATTACTTCA CGAGAAACGC TGTCTAAAGC CCCAGTTGGC TCATCAGCAA ATATTATATC 1860

ACTATCAGCA ATAACCGCTC TAGCTATAGC AACCTTCTGT TGTTCTCCAC CAGACAGAGT 1920

TCCAACAAAA TCGTTTAAGC CAGCATTAAA CTTCATTCTT TTGAGTAAGT TTTCTACATT 1980

TTTAATAGTT AATTTTTTTT GTGATAATCG CAAAGGAAGT GCTATATTTT CTATTACCGG 2040

CAGGGAAGGT ATTAAATTGT ATGCTTGAAA TATAAAAGAT ACTTCGTTAC GTCTTATACT 2100

TGACAATTTT GCATTTCTGA TTTTATAGGG GTTGATTCCA TTTAAAATTA CTTCCCCACT 2160

TGTTGGTTCA AGCAAACTAG AAATACATTT TAATAAAGTT GACTTTCCAG AACCACTAAT 2220

TCCTAGAATA CTTATAAATT CTCCTCTCGA AGCAGAAAGA CAAACATTTT TCAGCACTTG 2280

CAACGTTTTA TTATTTCCTA GTAAAAATTG ATGATACAGC CCTTTCACTT TTAATATATA 2340

ATCTTTATCC ATATTCTTGC CTCCAATCAC TTAATTTTGA AAAGTGTTCC ATTTTCCAAT 2400


TTATATATAT CAGTGTATCT CTTGTCATTT AAGTCATAAT GATGTGAAAC TTCAATAAAT 2460

GAAATACCTA AATTGAACAG AATATCATGT ATGGAATTTG AATTATCATT ATCTAAATTA 2520

GCTGATATTT CGTCAAATAA GTACACTTTA TTATTTCTAA TCAGACCTCT AGCTAAACCT 2580

ATTTTTTGTT TTTGACCTCC AGACAAATTA CTACCATTTT CACCACATTG ATAATTTAGT 2640

ATATCTATCT TTTCTAATTC TTCATATAGA TTTACCTTTT TTAACACCTC AATTATCTGA 2700

TCATCTGAAA AATATTCATT TTGAAATAAA GTTACGTTCT CACGAATAGT AGTGTCAAAA 2760

ATATATGGTG TCTGATCAAC TGTTGGTATT GAATCTGAAC TCTTTTTCCC ATGTGATAAC 2820

AAATTTACAT AACCTTTTTG TGGCTTTAAA GAACCATTAA TTAAATTTAA AATCGTTGTT 2880

TTCCCACTAC CAGAAGTTCC TCTTAATAAT ACCCTAAATG GTGACTTAAA TGAGAAGTCA 2940

ATACTTAATT TATTTTCTGG TGTAATAGAA TATACAACAT CTTTCATGTG TATCTCATCT 3000

ATTGATGAAG TATACAGTCC GTTATTATCA TGTTCAGCGT CTATAAAATT CTTCTCTCCA 3060

CTTAAGTATT TTAAAAACGG TTTCCTTAAA TCTTTGGTTG TATTTATCTT ATTTAATGAA 3120

TAGGCAATTG ATTGTATCGG CCCTAAAACT TTATCGTTTG CTAAGAAAAT ACCTATCAGT 3180

TCACTAAAAG AAAGGCTTTT ATGATAAATT ACAAAATAAC ATCCTACAAC CAAGGGAACT 3240

AGAAAGCAAA AACCTGAAAT TAGTACTGCA ACCAATTTTG AAAGAACCTC TGATCGTTTC 3300

AAATTAAAAG TAGAATCTTC TAGTTTATCC AACTTTTTAT CCGACAAACT AATTATTTCT 3360

TTAGTAACAG AATAAGATTT TAATGTCTTA AAACCATTAA AAATTTCTTT TATTATGTGA 3420 

GTATACTCTG CATTGCTGTT AGAGTACTCA TTAGCTGAAT TAGACAACAT CTTCTTCATA 3480 

AAGACAGGTA CTATAATCGG CAATGCTGAT AATACAATAA ATATTATTGA nACTAGGAAG 3540 

TTTAAATAAA GCATAAAACT TAGAGAGACG ATGAACAACA ATATTGAAGA AATTATTTCA 3600 

AAAATTTGTC TAAAATAGTT TTCTTCGATT AATCTCAAAT CATTTGACAA AACTGAAATA 3660

ATAGATGAGT AATCTTTAAC CATTTCAGAA GAAAGATACT GTTCTCTAAA ATATCCTTGT 3720 

TTAATTTTTA CATTTATATC TTTAGTTATT GATGCTTCCG TTACTTCTAA ATAGTAATTT 3780 

GATATATAGA TTGCTCACCA ACCCAGAATA CTTATAGCAC CAAATCTTAG AACGTCAGAA 3840 

AATGAGGAAG TCTGATTTAA ACTACCTGCA TATACAATAA TTCCTGAGAG CAAGACACCA 3900 

TTAAACGAAG ATAGAAATAT TAAAATCCCC ATTAATATAA GTTTAGTCTT TTTTATAAAT 3960 

TTTAAATAAT TCATAAGTTA TTCCTTCCCA CTTCTTCAAA GAAATAATTT AAAGTATCAA 4020 

TCATTAAGAG AACATCTGAT GGAGTAAAAC CTCCATGACC AGCTGCTTTG TTTAAATACA 4080 

ACAAACTTTT AACTCCAATA GAATTTAATT TCTTTGACCA CTCTATCACT TCGTTATTAT 4140 

TAATATATGG GTCTTTCTCA CCCAAAATAT TAACTATAAC AGTATTTGAG TCTCGTGCCT 4200 

TTTCAATATT TTGCATAGGC GAATATGACT TTATATAAGC CTTTACTTCA GGGTCTCTAA 4260 

TATCTCCCCA CTCTGCTATT TCGGTCTTAG AAAGACCATC ATTTCGATTC TGAAGTGTAT 4320 

CATAAGGATT TATAAATGGC GAAAATAAGA GAATGCTTTG CAATAAATTT TTTTCCTCGT 4380 

TCAACACCGC ACCAGCAATT ATTCCACCTG CACTAGAACT TATTAAACCT AATCGCTTAC 4440 

TGTCAATTAC ATCATTTTCC CTTAAATAAT TTACTCCCTC AATAAAATCT CTGATAGAAT 4500 

TCCATTTGTT TAACGCCTTT CCTGACCGAT ACCATTCACC ACCCAAATAC CCTCCACCTC 4560

TTACATGAAC TATAGCATAA ATAAAACCTG CATCTATTAT AGATAACATA ATTTCATCTA 4620 

AATCAGAATT ATCATTCTTA CCATAAGCCC CATAGACACT TAGAATACAT TTTTTTCTI'C 4680 

TTGGGAGCTC ATCCGTATCT TCACTTTTCC AAAATAAAGA AATCGGTATG CTTACATCAT 4740 

AACTGTCTTT TTTAGTCCAA ATCACCTTAG AAAAATATTT AGTATTATTC GATTTTATGA 4800 

TGGGTCTTTC AAATTCAGTT TTTAATGTAT TTTCTATTAA ATCAAAACTA AGTATTTTTT 4860 

CGTAAAAAGT TCTCCTCTCT AAAAACAGAA GAACACGATC AGAAAATGAA TTTTCATAAA 4920

GTGTTGTCTT TTCATCAAAT GTTATCTTAT TAACACTCAA CTCCCTCAAA CTATTATTTT 4980

TAAATGTAGC AAGATAAAAG ACGGAATTCG CTGCGTTTGA ACAGTCTAAA AGGATATAAC 5040 

GTCCTATACA GTGAACTCTT CTAGCCCTAT CTTGATATGC TATAGTAATA GAAACTCTGT 5100

CTCCCGAAGA AGTTTCCCTT AGAATTAGTT GATCTTTCTT TTCTTCAGTT GAAGAGAGCC 5160

CAAGAAAGTA CTGTGCTTTT TCTGTACTAA ATAGAGCGAT ATCTCTAGGT GTTGGGGCTA 5220 

CCCTTTCTGT GTAAGAGTGT CTAACAAAAC CCGTCCGGTC GAAACTGTAT AGAAAAATCC 5280 

TGCCTTTCTG AAAGTCTACT GACTTTACAA AACAATTATT GCTATCAATG TGGACTATTT 5340 

TTAATCGAAA AGAGCATTCG TTTTCTTCAA ACAGTTCCTC TTCTGTAAAG CTATCAAAAG 5400 

ATTTATAGAA TAACTTACTT GGCCTCCCGT ACTCTTTGGA GCGAGTATAC ATAACACCGA 5460 

ATTTACCCAA ATAGAACGAA CTTTCTACTG AAATATCTTC AATGATAAAT AACTCTTCCA 5520 

TAGTATATTT TTTTATTCCA ATTAAATTAG TCGTACGCAG TGAGGATACA ACCAAAACTA 5580 

TATAACTCTC ATCAGATGAA ATCCTAACAT CCTGTAAGAT ACTATCATCT GGCAAAGTAT 5640 

ATTTTTCCAC ATCAAAGACA ATTTTAAGTG AATTTGAATT GTCTAAACTG GAAGAACTAA 5700 

CCTTAGGAAT CCAGTCATTA TCTTCGACAT ACCATTCCTT TATTACACCA GTATTGGGTA 5760 

TACTCCAATT ATCAAATTGG TACCAATATC GCCCTCTCCT AAATATCAAA GAATTCCATT 5820

TTTTTAATTC CTGAAATGAT GAAGAGATAG ACCTCTTATA GTGTGTTTTT TCCTGTATTC 5880 

TATTTAAAAA TATTTCATTA CTCTGATTCA CAAGTATGAC CCCTTAATAA TGGTATCTAA 5940 

ATATTATATT TGAGGAAGAA TCGTCAATTT ATTATCCATT ATTGATACCA ATCCAATTGC 6000 

AACACCCGCA AATCCCGAAG CAATATCTGT TGTTATCTTT AAACCATTAT CTCCCGCAAT 6060 

AACAAATCCT TCTTCAATTA CACACAAATA TCTATAAAGT TGTTCAATTA ATTTCTTTTG 6120 

TCCTGAAAAG TTATCATCGA TATCACTATA TATATTATTA GCAACTTCAA GACCACAAAA 6180 

TCCGTTAAAT AAACCTGGTA ATACACAAAA AACTACATCA GTTGCCCTCT CTAAAGAAGT 6240 

TAAATATTTT AAGTATTTGC TTGACAAGAT TTCTTTATTT CTATTAATAA GTAAAAGCAG 6300 

GCCAGCACTT CCACTTGCTA CATATGGTAG TAATCTATGA CCTTGGCTGT ACTGCAATGA 6360 

ATTATTACTA TCTACTTTAT AAGCAACTAA TTCTTTATCT ACAGCCAATT CTAGACCATT 6420 

TTTATAGATA CTTTCACCAG TTAATTTATA AGCTTCACCG AAGAGCCAAG CTACCCCTGC 6480 

GTGACCATAT AGTAATCCAC CAAAATTCTC ATAAGGATCG TTACTCTGAA CATCACTAGC 6540 

GCCAACTTTA CAAAAAGTTT CTGGATTTTC TATATAATTT AAAGTATATT CTCTAAGCCT 6600 

AATTAGTATT TCTTCTCCTA GTTTATTATC AATTCCCCCT TTACTAAGAA AATACAGTCC 6660 

AACCAGTAAA ATTCCAGCCT GCCCACTATA TAAATTTTTA TTTTGTGAAT TCTCAAATAT 6720 

CTCTATAAAA TGAGTTGTAA AAAGTTCAAC TGCCCGATCT ATCTCCCCAA ATTCATAAAT 6780 

GAGCCAGATT GTACCAATTT TACCATCAAA AAGACCAGAA AGGGACGATT TCTTAAAATT 6840

ATTTACTGCC TCATTAATAA CCTGTGTTCG AATCTCATAA TAGTCATCAA ACTTGAAATT 6900

TTTTACTTTC TTAGCTAGTT GTTGATAACT CCAAAGGATA GCTAAATCTG AAAACGCAAT 6960

TCCTTGATTA AAATTCAGAC CATAATAATG AACTGGGAAG AATCTTGATT GAAATTCTTT 7020

ACGCCACTGT CCATAAGTTA GCGTAAACCC TCTCAATAAT TTTATAATAA AATCTTGTAT 7080

ATCTTGCTCA CTCTCGATAG TTCTAATCTC ATGCATGGGT TTTAAAACTT TTTTCCTGGA 7140

AATATTCTCA ATCTGTGCAC ATTTAGAATC TAGATATGAC AATAAACTTT CTACATAATC 7200

TATATGTTCT CTTGTATAAC CCAAAGACTC AAATAGTTTT TTTCCTTCTA TCCTGGTTTG 7260

ACTTACATAG TTGTATGTCA AATCCGATGT AGTTACTAGT GGCATGTATA AATAATGAGC 7320

TATTTGTCTA ATACCATACC AATCTATCTC ACTGGGAAGT GTTTCTCGCC ATGCTCTAAA 7380

ACCAGGGGCT GCAACTTTAT GTACAACTTT TTCATCATTT GAAAAGACAG CCTGTTCCCA 7440

GTCTATTATA CTAATCTCAT CTTCATCCTT AACCAAGATA TTTCCTAAAT GTAAATCTTG 7500

ATGATATACA TTTTCAGAAT GAAACTTATT CGTTAAATCG ATGAGTTTTT CTACTATCTT 7560

TGAAACTCTC AATAGATAAT CTTTGGTCTT ATCAACAACT TCATATAAAG GAAAATTATT 7620

GGTAACCCAT CTATTTAGTG GAACGCCCTT CATATGTTCA ATTCCTAAGA AGGTGTGCTC 7680


CCAGATCTTA CCGTGCCAGT ATATTTTAGG CGTCTCACTC CATTCATTTA GAATTTTTAG 7740

TGCTTTGCAC TCCGAAGCTA ATTTCTCTGA AGAATAAGTA CCATCAAATC CTAGACCTGT 7800

ATACGGTCTA GCCTCTTTTA AAATTATTTT TTTCCCATCT TCTTTTAGCC TAGCATTATA 7860

TATCCCACCA CTGTTTGAAA ATCTAATTGC ATTATCTATA ATAAAGGGAA AGTCTCCCTG 7920

TTTTTTATCT TTCTTGTCAA GCCATTTATT CAAAAAGTCA GGGGGCACTA TACCTTTTGG 7980

AATTTTAAAT ACTGGTAAAC GTTCATCTTT AACAACTTCA TCGCCAACAA TTAATTCATC 8040

AATAGCAACC TTCTTTTCAT CATCCCTTGA CGGCCTAAAC ACACCATACC TCAGATATAT 8100

TGGTGCTTCA TCCCAACGTT TATCGCTTAA AATATATGGC CCATTATATT GCTTTAAGGC 8160

ACTTTCTAAC CTTTGCAAAA [illegible] I l l_A 111 I un 111 uun 1 /vi^> [illegible] 8220

TTTACCAGAA AATCCTCGAC TAACCAATTT CCCGTTTCGC ATGATAAATT TGTCTTCTGT 8280

ACTAAGATGT TTAAATGGAA TTCGCATTTC ATGGCAAATT TTTGCTACAT CTTGTAACAA 8340

TTCATCTGAA CTGTTATACT CTGAACTAAT GTGTATTTTC CACCCTTGTC TTTCAACAAA 8400

TTTTCCAATA GGGTATTGAT AAACCCACTC ATCATTATTC ATTACTTCGT GCCAATTAAA 8460

AGGCAGACTT ACTTGGTACT TTATGCTAGT ATCTGTACTA TAATCATTAT TAGTGAAAAA 8520

GAAAGGATGC TCCAAATTGA AATTATAATC CATAACAAAA TCTCCAAGAA ATTTTATCAA 8580

ACTTAATATA TCTATAGCTA GACAGACTTA TTTAAATAAA AAGGGAGAAT CCTTTGGATT 8640

CTCCCCATAT AAGCACTAAC ATTCCAACGT GCACATATTG GAACGACATC CATAACTCCA 8700

CAGAATCTCT AAAGTTTACA ATTTAAATGA ATTAACAATT TTCCCAACTA AAAGCACTCC 8760

AGTTACCGCA ACGATTTGTA CTGAATGTAC TAAATCGCAT TCCATCAACT TCATCTGTTT 8820 

CGTCAACTTG AACAGATACT AATTGAAGAT TTAATACTTC TTCTGCCATA GCTAGCTCCT 8880 

CCTATTTAAA 7TTTTGGGAT TAAGTACTTT ATCCACCCTC ATTATACTCT CTCCACCAGT 8940 

AAAATGCAAG CAATTATACA ATGTTGTCAC ATAGAAAATA ATGTTTCCGT AACTTTTCAA 9000 

AGTAACTTCC ATCTCTCTCC CAAAACTGGA AGTTAGTTTT AGAAGTTACC TAAAAATCAG 9060 

GTCACCTATT TTAAAAAAGC AGCAAACTAT AAACTACTAG GTTCCACACC AAATGTAGTC 9120 

CCATACTGCC CCATAAGTCA GATTTATAGC GCACCATACC TAAAAACATC CCAAGTGAAA 9180 

CATACAAACA CCAAGCTAGA ATGGTTCCTG TATGATGTGC TAAGGCAAAT AAAACACTTG 9240 

TCAAAGCAAC TCTGATATCT AATTTTCTGA CCAAATTCCA TAAAATTTCT CGATACAGAA 9300

ATTCTTCAAC CATACTCGCA TTGATTAAGA ACAATAAAAA TGAAAACCAA GGAATTTGAT 9360 

GTTGAAGGCC AATTAAGTTT GCTTGATTCG TGCTTCCTTG AGCATGAATC AGACTAAAAC 9420 

ATAGACTTAT AATCAGTAGG CTAACAAATT CAACACCAAG CCATTTCATC CTAGATTTCA 9480 

TATTGACCTT ATGCGCTTGT TTGCGTTGGC CATACATCCA TAAAAAAGAA ATGAGTGACG 9540 

AACCATAGAG AATCTGTAGT ATAGTTmACT CACCGATACA AAGAAATTTC AATAAGTATA 9600 

GAGrTACCAA TAsGACATTT ACTTGTTGGA ATATATAAAC TGGAATTATT CTTTTCATAG 9660 

TTACCTCCGA AATAAATCTT CATAATCTAA ATCTAATACC TGCACAATCC TTT 9713 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8657 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

AAAGAAATTG TCAGAGAGTG GCTAGATGAA GTAGCAGAGC GGGCTAAGGA CTATCCAGAG 60 

TGGGTGGATG TTTTCGAGCG TTGCTACACC GATACCTTGG ACAATACGGT TGAAATCTTA 120 

GAAGATGGTT CAACTTTTGT CTTGACTGGG GATATTCCTG CCATGTGGCT TCGAGATTCG 180 

ACAGCCCAAC TCACACCCTA CCTTCATGTA GCTAAAAGAG ATGCCCTCCT GCGTCAGACC 240 

ATTGCAGGTT TGGTCAAACG TCAGATGACC TTGGTACTCA AGGATCCCTA TGCTAACTCC 300 

TTCAACATTG AGGAGAACTG GAAAGGGCAC CACGAGACTG ACCACACAGA CCTTAACGGC 360

TGGATCTGGG AGCGCAAGTA TGAGGTGGAT TCGCTTTGCT ATCCTTTGCA GTTGGCTTAT 420

CTCCTCTGGA AAGAGACTGG CGAGACTAGT CAGTTTGATG AGATTTTTGT CGCAGCGACT 480

AAGGAAATTC TCCATCTGTG GACGGTGGAA CAAGACCACA AGAACTCTCC TTATCGTTTT 540

GTCCGAGATA CGGACCGTAA GGAAGACACC TTGGTAAATG ATGGCTTTGG ACCTGACTTT 600

GCAGTGACAG GTATGACTTG GTCAGCTTTT CGTCCGAGTG ATGACTGTTG CCAGTATAGT 660

TACTTGATTC CGTCAAATAT GTTTGCTGTA GTAGTCTTGG GTTATGTGCA AGAAATCTTC 720

GCAGCATTAA ACCTAGCTGA TAGCCAGAGT GTTATTGCTG ATGCCAAGCG TCTTCAGGAT 780

GAAATCCAAG AAGGAATCAA AAACTACGCT TACACCACCA ACAGCAAGGG CGAAAAGATT 840

TACGCTTTTG AAGTGGATGG CCTAGGAAAT GCCAGCATCA TGGATGATCC AAATGTACCA 900

AGTCTACTAG CTGCGCCCTA TCTGGGCTAC TGTTCGGTCG ATGATGAAGT GTATCAAGCT 960

ACTCGTCGTA CCATTTTGAG CTCTGAAAAT CCATACTTCT ACCAAGGAGA ATACGCAAGC 1020

GGTCTCGGCA GTTCTCATAC CTTCTATCGC TATATCTGGC CAATCGCCCT TTCTATCCAA 1080

GGCTTGACAA CAAGAGATAA GGCAGAGAAA AAATTCTTGC TGGATCAGCT GGTTGCCTGC 1140

GATGGTGGTA CAGGTGTCAT GCACGAAAGC TTTCATGTAG ATGATCCGAC CCTCTACTCT 1200

CGTGAATCGT TCTCCTGGGC TAACATGATG TTCTGTGAGT TGGTCTTGGA TTACTTGGAT 1260

ATTCGCTAAG GGGCTCGCTT TAGCTCAACC GATTCTTATC AGAATCACAA GTTTACATTT 1320

AAAACGTTAA AATTTAAATT TAGAATGAGG TTTTACTTCA TGGAAAATGT TGTTGTACAT 1380

ATTATCTCAC ATAGTCACTG GGATCGTGAG TGGTACTTGC CTTTTGAAAG CCATCGTATG 1440

CAGTTGCTGG AATTGTTTGA CAATCTCTTT GATCTCTTTG AAAATGACCC TGAGTTCAAG 1500

AGTTTCCACT TGGATGGACA AACTATTGTC CTTGATGACT ACTTACAAAT TCGCCCTGAA 1560

AATCGCGACA AGGTCCAACG CTACATTGAC GAGGGCAAAC TTAAAATTGG TCCCTTTTAC 1620


ATCTTGCAGG 


ATGACTACTT 


GATCTCCAGT 


GAAGCCAATG 


TCt-GCAATAL 


V- I 1 un I I * 


1680


CAACAAGAAG CTGCCAAATG GGGTAAATCA ACCCAGATTG GCTACTTTCC AGATACCTTT 1740

GGAAATATGG GACAAGCGCC TCAAATTCTT CAAAAATCAG GCATTCACGT GGCGGCCTTT 1800

GGTCGTGGTG TGAAGCCGAT TGGATTTGAC AACCAAGTCC TTGAAGATGA GCAGTTTACG 1860

TCTCAGTTTT CAGAAATGTA CTGGCAGGGT GTGGATGGTA GTCGTGTTTT AGGTATTCTC 1920

TTTGCCAACT GGTACAGTAA CGGGAATGAA ATTCCAGTTG ACAAAGATGA GGCCTTGACC 1980

TTCTGGAAAC AAAAATTGTC AGATGTGCGT GCCTACGCTT CGACCAACCA ATGGTTGATG 2040

ATGAACGGCT GTGACCACCA GCCTGTACAG AAAAATCTGA GCGAAGCCAT TCGTGTGGCA 2100

AATGAACTCT TCCCGGATGT AATCTTTGTT CATAGTTCTT TTGATGAATA TGTTCAAGCT 2160

GTAGAAGGTG CGCTTCCTGA ACACTTATCA ACTGTTACAG GCGAGTTGAC CAGTCAGGAA 2220 

ACAGATGGCT GGTACACACT TGCCAACACT TCTTCATCCC GCATTTACCT AAAACAAGCC 2280 

TTCCAAGAAA ATAGCAACCT CCTAGAGCAA GTGGTAGAAC CCTTGACTAT TATCACTGGT 2340 

GGACACAACC ACAAGGACCA GTTGACCTAT GCTTGGAAAA CACTTTTGCA GAATGCGCCA 2400 

CATGATAGTA TCTGTCCCTG TAGCGTGGAC GAAGTTCACC GCGAGATGGA AACGCGTTTT 2460

GCCAAGGTCA ACCAAGTAGG AAACTTTGTT AAAAGTAACT TGCTCAACGA GTGGAAGGGT 2520 

AAAATTGCTA CGGATAAGGC TCAAAGTGAC TATCTCTTTA CTGTCATTAA CACAGGCTTG 2580 

CATGATAAGG TCGATACTGT CAGCACAGTG ATTGATGTGG CGACTTGTGA TTTCAAGGAA 2640 

TTGCACCCAA CAGAAGGCTA CAAAAAGATG GCTGCTCTTA TCTTGCCAAG TTACCGTGTG 2700 

GAGGACTTGG ATGGTCGTCC TGTAGAGGCT ACAATCGAAG ACCTCGGAGC TAATTTTGAG 2760 

TATAATTTAC CAAAAGACAA GTTCCGCCAA GCTCGTATTG CTCGTCAAGT GCGCGTGACC 2820

ATTCCAGTTC ACCTAGCGCC GCTTTCTTGG ACAACCTTCC AATTGCTGGA AGGAAAACAA 2880

GAACACCGTG AGGGTATTTA CCAAAACGGA GTGATTGATA CACCATTCGT AACGGTGAGT 2940

GTGGATGACA ACATCACAGT CTATGACAAG ACAACTCACG AAGCCTATCA AGACTTTATC 3000 

CGCTTTGAAG ACCGTGGGGA CATCGGAAAC GAGTATATCT ATTTCCAACC AAAAGGAACA 3060 

GACCCAATCT TTGCAGAGCT TAAGGGCCAC GAGGTCTTGG AAAACACAGC TTGCTATGCT 3120 

AAAATCTTGC TCAAACATGA ATTGACCGTG CCTGTCAGTG CGGATGAAAA GCTAGAAGAA 3180 

GAGCAACAAG GTATCATCGA GTTTATGAAG CGTGAGGCTG GACGGTCAGA AGAATTGACA 3240

AACATTCCTC TGGAAACTGA GTTGACTGTC TTCGTTGACA ATCCACAAA? CCGCTTCAAG 3300

ACTCGCTTTA CTAACACTCC CAAGGATCAC CGTATCCGTC TCTTGGTCAA GACTCATAAC 3360

ACGCGTCCAA GCAATGATTC TGAAAGTATC TATGAGGTGG TGACACGACC AAACAAACCA 3420

GCTGCTTCAT GGGAAAACCC TGAAAATCCT CAACACCAAC AAGCTTTTGT CAGTCTGTAT 3480

GACGATGAAA AAGGGGTGAC TGTATCCAAC AAGGGATTGA ATGAATACGA AATCCTTGGG 3540

GATAACACCA TTGCCGTGAC CATTTTCCGT GCATCAGCTG AGCTAGGTGA CTGGGGCTAC 3600

TTCCCAACGC CAGAAGCACA ATGCTTGCGG GAGTTTGAAG TCGAGTTTGC ACTTGAATGC 3660 

CACCAAGCCC AAGAACGCTT CTCAGCCTAT CGTCGTGCCA AAGCCTTGCA GACACCGTTT 3720 

ACCAGCCTTC AGCTTCCTAC ACAGGAAGGA AGCGTGGTTG CGACTGGTAG CCTCTTGAGC 3780 

CATTCTGTTC TCAGCATACC GCAAGTTTGT CCAACAGCCT TTAAGGTAGC TGAAAATGAA 3840 

GAAGGCTATG TGCTTCGTTA CTACAATATG TGTAGTGAAA ATGTACGTGT GCCAGAAAGT 3900

CAACATCTCT TCCTTGACCT ACTTGAACGA CCATACCCAG TTCATTCAGG ACTATTGGCT 3960 

CCACAAGAGA TTCGTACAGA ATTCATCAAA AAAGAAGAAA TTTAATTTCA AAAAGTAAAC 4020 

ATCAAAAGAA AGGAGGGGCG AAAAAGTAAG AACTAACTGC TGATTCGCCC CTTTTATGGT 4080

AAAAACAATG ACCATTGCAA CGATTGATAT CGGAGGGACT GGGATTAAGT TTGCCAGTCT 4140 

GACTCCTGAT GGGAAAATAC TGGATAAGAC AAGTATTTCA ACGCCTGAAA ACTTGGAGGA 4200 

TTTACTAGCG TGGCTAGATC AACGCTTGTC AGAACAGGAT TACAGTGGGA TTGCTATGAG 4260 

CGTTCCAGGT GCAGTCAATC AAGAGACAGG TGTGATTGAT GGCTTCAGTG CGGTGCCCTA 4320 

CATCCATGGC TTTTCTTGGT ATGAGGCGCT TAGCTCTTAT CAGCTACCTG TCCATTTAGA 4380

AAATGATGCC AACTGCGTTG GACTCAGTGA ACTACTAGCT CATCCAGAGC TTGAAAATGC 4440

AGCCTGTGTC GTGATTGGGA CAGGGATTGG CGGAGCCATG ATTATCAATG GTAGACTTCA 4500 

TCGAGGTCGC CACGGTCTGG GTGGAGAATT TGGCTACATG ACAACCCTTG CCCCTGCTGA 4560 

AAAACTTAAT AACTGGTCGC AACTAGCATC AACTGGGAAT ATGGTACGAT ACGTGATTGA 4620

AAAATCTGGT CATACTGATT GGGACGGTCG CAAGATTTAC CAAGAGGCCG CAGCTGGTAA 4680 

TATCCTTTGT CAAGAAGCCA TTGAGCGCAT GAACCGCAAT CTGGCGCAAG GCTTGCTCAA 4740 

TATCCAGTAT CTGATCGATC CAGGTGTCAT CAGTCTGGGT GGCTCTATCA GTCAAAATCC 4800 

AGATTTTATC CAAGGTGTCA AGAAGGCTGT TGAAGACTTT GTCGATGCCT ACGAAGAATA 4860 

CACGGTCGCA CCAGTTATCC AGGCCTGCAC CTATCACGCA GATGCCAATC TCTACGGTGC 4920 

TCTTGTCAAC TGGCTACAGG AGGAAAAGCA ATGGTAAGAT TTACAGGACT TAGTCTCAAA 4980

CAAACGCAAG CTATTGAGGT TTTAAAAGGT CACATTTCTC TACCAGATCT GGAAGTGGCT 5040 

GTCACTCAGT CTGACCAAGC ATCTATCTCT ATCGAGGGTG AGGAAGGTCA CTATCAATTG 5100 

ACCTACCGCA AACCTCACCA ACTTTATCGT GCCTTGTCCT TGTTGGTAAC AGTTCTAGCA 5160 

GAAGCTGATA AAGTAGAGAT TGAGGAACAA GCAGCTTACG AAGATTTGGC TTACATGCTT 5220 

GACTGTTCTC GAAATGCGGT GCTGAATGTG GCTTCTGCCA AGCAGATGAT TGAGATATTG 5280 

GCTCTCATGG GCTACTCAAC CTTTGAGCTT TACATGGAAG ACACTTACCA GATTGAAGGG 5340 

CAGCCTTACT TTGGCTATTT CCGTGGAGCT TATTCAGCAG AGGAGTTGCA GGAAATCGAA 5400 

GCCTATGCCC AACAGTTTGA CGTGACCTTT GTACCATGCA TCCAGACCTT GGCCCACTTG 5460 

TCGGCCTTTG TCAAATGGGG TGTCAAGGAA GTGCAGGAGC TCCGTGATGT AGAGGACATT 5520 

CTTCTCATTG GCGAAGAAAA GGTTTATGAC TTGATTGATG GCATGTTTGC CACGTTGTCT 5580 

AAACTGAAGA CTCGCAAGGT CAATATCGGG ATGGACGAAG CCCACTTGGT TGGTTTGGGA 5640 

CGCTACCTGA TTCTGAACGG TCTTGTGGAT CGTAGTCTCC TCATGTGCCA ACACTTGGAG 5700

CGCGTGCTGG ATATTGCTGA CAAATATGGT TTCCACTGCC AGATGTGGAG TGATATGTTC 5760

TTCAAACTCA TGTCAGCGGA TGGCCAGTAC GACCGTGATG TGGAAATTCC AGAGGAAACT 5820

CGTGTCTACC TAGACCGTCT CAAAGACCGT GTGACTCTGG TTTACTGGGA TTATTATCAG 5880

GATAGCGAGG AAAAATACAA CCGTAATTTC CGCAATCATC ACAAGATTAG CCATGACCTT 5940

GCATTTGCAG GGGGAGCTTG GAAGTGGATT GGCTTTACAC CTCACAACCA TTTTAGCCGT 6000

CTAGTGGCTA TCGAGGCTAA TAAAGCCTGC CGTGCCAATC AGATTAAAGA AGTCATCGTA 6060

ACGGCTTGGG GAGACAATGG TGGTGAAACT GCCCAGTTCT CTATCCTACC AAGCTTGCAA 6120

ATCTGGGCAG AACTCAGCTA TCGCAATGAC CTAGATGGTT TGTCTGCGCA CTTCAAGACC 6180

AATACTGGTC TAACGGTTGA GGATTTTATG CAGATTGACC TTGCCAACCT CTTACCAGAC 6240

CTACCAGGCA ATCTCAGCGG TATCAATCCC AACCCCTATG TTTTTTATCA GGATATTCTT 6300

TGTCCGATTC TTGATCAACA CATGACACCT GAACAGGACA AACCGCACTT CCCTCAGGCT 6360

GCTGAGACGC TTGCTAACAT TAAAGAAAAA GCTGGAAACT ATGCCTATCT CTTTGAAACT 6420

CAGGCCCAGT TGAATGCTAT TTTAAGTAGC AAAGTAGATG TGGGACGACG CATTCGTCAG 6480

GCCTACCAAG CGGATGATAA AGAAAGTTTA CAACAAATCG CCAGACAAGA ATTACCAGAA 6540

CTTAGAAGCC AAATTGAAGA CTTCCATGCC CTCTTTACCC ACCAATGGCT GAAAGAAAAC 6600

AAGGTCTTTG GTTTGGATAC AGTTGACATC CGTATGGGCG GACTCTTGCA ACGCATCAAA 6660

CGAGCAGAAA GCCGTATCGA GGTTTATCTG GCTGGTCAGC TTGACCGCAT CGACGAGCTC 6720

GAAGTTGAAA TCCTACCATT TACTGACTTC TACGCAGACA AGGATTTCGC AGCAACTACA 6780

GCCAACCACT GGCATACCAT TCCGACAGCC TCGACGA7TT ATACGACTTA ATATTCTTCG 6840

AAAATCTCTT CAAACCACGT CAGCTTCCAT CTGCAACCTC AAAACAGTGT TTTGAGCAAC 6900

CTGCAGCTAG CTTCCTAGTT TGCTCTTTGA TTTTCATTGA GTATAAAAAC AAGAACACCT 6960

TGCTTGGCGC AGGGTGTTTC GCGTGAAACA GAAGAATTAT CTGGTTTCAA ATGCTACAGT 7020

TAGACAAACT TATGATAAAA TAGCAGAAAG TGAATGTTTC CTAAGAGCAA TTGGAGGTAT 7080

TATGCTACAC TTAAAATTAG TAAAACAAGA AATAGAAGCT GAAAAGCCAG CATCTGTAGA 7140

AGCTTGGATC ATTTCCGTCA AATTTAAAAA AGGTTGCTAC CGACATATAT AGATTCCAAA 7200

AACAAAAACG TTAGCGGAAC TAGCAGATGT GATTTTATGG AGTTTTGATT TTGCAAATGA 7260

TCATGCTCAC GCATTTTTCA TGGATAATGT TGAGTGGAGT CATGCAGATT CTTACTTTCG 7320

TAGCTTTGTT AGTGACGATG TTGAAGAACG TTACACAGAA AATGTCTATC TGGATAGCCT 7380

AAGTGTCAAA CAAAAATTTA AGTTTATTTT CGACTTCGGT GA^G^.TCXJC GTTTTGAATG 7440

CCAAGTGCTG AGAGAAATCG AGACAGAGGA CGAAGAAGCT TATCTCGTAC GTTCGGTTGG 7500 

AACGTCGCCA GAACAATATC CAGATTATGA TGGTTTTGAC TATGAAGAAT GGTAAAATTG 7560 

AAATCAGTCT GTGTAGGCTT AGTATTTCAA TAGACTTCCT GCAAAACTAG AATCCTAGTT 7620 

CATGATTGAT AATACCAGCA ATCAAATTCA TTCGTAATCC GAAGCGTTTA CGATGATTTC 7680 

GATAGGTTGT TGAAAACATT TTAAACGTTT TTACTTTGGC AAAGATGTTC TCAACCTTGC 7740

TTCTCTCCTT AGATAGCGCA TGGTTATAGG CTTTATCTTC AGCTGTTAGT GGCTTGAGTT 7800 

TGCTGGATTT ACGTGAAGTT TGTGCTTGAG GACATATCTT CATGAGCCCT TGATAACCAC 7860 

TGTCAGCCAA GATTTTACCA GCTTGTCCGA TATTTCTCCA ACTCATTTTG AACAACTTCA 7920 

TATCATGACA ATAGTTCACA GTGATATCCA AAGAAACAAT TCTCCCTTGA CTTGTGACAA 7980 

TCGCTTGAGC CTTCATAGCG TGAAATTTCT TTTTACCAGA ATCATTCGCT AATTCTTTTT 8040 

TTAGGGCGAT TGATTTTTAC TTCCGTCGCA TCAATCATTA CCGTGTCCTC AGAACTAAGA 8100 

GGAGTTCTTG AAATCGTAAC ACCACTTTGA ACAAGAGTTA CTTCAACCCA TTGGCTCCGA 8160 

CGGATTAAGT TGCTTTCGTG AATACCAAAA TCACCCGCAA TTTCTTCATA AGTCCGGTAT 8220 

TCTAGGCTTA ATTTAGGTTT TCGTCCACCT TTTGCGTGTT TAAGTTGATA AGCTCTTTTT 8280

AATACAGCTA ACATCTCTTT AAAAGTCGTG CGCTGAACAC CAACAAGACG CTTAAATCGT 8340 

CTATCAGTTA ATTGTTTACT TGCTTCATAA TTTCGCAGGG ACTCTATTGA CTCTTTGGTA 8400 

GGTGTCAATG TTTTTTTCAT CTATCCCGAG AATTATTTTC CCCCCATTTG TATTTGCAAA 8460

TGCTGAGTAG GTTTCCCAGA AAGACTCTGG AAGATTGTTT TTAGCTTTTT TGTATTCTAA 8520

ATCAACCCCT TCAAATTTTA AGTCCATATT TTTCCTTTAC ATCTGTTTTT TGTGGTTCTC 8580 

GTATTTGTTC AAGTTGAGTG ATAATATAGC GAATTGAATT TCGAGAGTTT TTACTCAGTT 8640 

AATTTCTTTT TTAACCC 8657 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11384 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

TCTATTTTGG CTATAGACTT ACCTATAAAG AAAAATATCT ATACACTGCC TTACTAGCTA 60 

TACTGAACGA GTCAACAAAA ACGATATATA TTGATGATAT AAATACAGCA AGATTTTTTA 120 

ACTTCTTTGG CAATGATATT CCTAATTCGT CTTTAAAAAA AATTGACTAT ATCGCACCTT 180

CAGAAATTGT TTCATTTAGT ACGTACGTTC GACAACGTTC TAAAGTAATT CCTAAAATTT 240

TGGAACATAT ATTAAAATCA AGTTTTTTAT TAGAGAATAT AGATGTTTCT GGTTACACTG 300

TAAATATTTT ACAAGATCAA TTAACAAAAC ATAGAACAAT CAAAATTAGT AAAAACTAAC 360

TGGTTGATCT CATGTATAAA TACCTAACAA AACCACGCGC CTTGCCTGCT GATGGAAAGA 420

AAGGTACAAA TACATGAATA TCAAAGAAAA AATCAAAAAG AATGGCCAAA GAGTTTATTA 480

TGCTAGTGTT TATCTAGGCG TTGACCAACT AACGGGCAAA AAAGCCCGTA CAACTGTTAC 540

AGCAACCACT AAAAAGGGCG TTAAAGTAAA AGCGCGTGAT GCGATCAATA CTTTTGCTGC 600

TAATGGCTAT ACAGTTAAAG ACAAGCCGAC AATTACAACA TATAATGAGC TTGTAAAAGT 660

TTGGTGGGAT AGTTACAAGA ATACAGTTAA GCCAAATACT CGCCAATCCA TGGAGGGATT 720

GGTTAGAGTG CATTTATTGC CTGTATTTGG CGATTACAAG CTATCTAAAC TTACTACGCC 780

TATTCTTCAA CAGCAAGTAA ACAAATGGGC TGACAAGGCA AATAAAGGCG AAAAAGGGGC 840

ATTTGCTAAC TACTCTTTGC TCCATAACAT GAATAAGCGT ATTTTGAAAT ATGGCGTAGC 900

TATCCAGGTA ATACAATACA ACCCAGCTAA TGATGTCATC GTTCCACGCA AACAGCAAAA 960

AGAAAAGGCT GCTGTCAAAT ACTTAGACAA CAAAGAATTA AAACAGTTTC TTGATTATTT 1020

AGATGCTCTG GATCAATCAA ATTATGAGAA CTTATTTGAT GTTGTTCTGT ATAAGACTTT 1080

ATTGGCCACT GGTTGCCGTA TTAGTGAGGC TCTGGCTCTT GAATGGTCTG ATATTGACCT 1140

AGAAAGCGGT GTTATCACCA TCAATAAGAC ACTAAACCGC TATCAGGAAA TAAACTCACC 1200

TAAATCAAGC GCTGGTTATC GTGATATACC AATAGACAAA GCCACATTAC TTTTACTGAA 1260

ACAATACAAA AACCGTCAAC AAATTCAGTC TTGGAAATTA GGCCGATCTG AAACAGTTGT 1320

ATTCTCTGTA TTTACGGAGA AATATGCTTA TGCTTGTAAC TTACGCAAAC GCCTAAATAA 1380

GCATTTTGAT GCTGCTGGAG TAACTAACGT ATCATTTCAT GGTTTCCGCC ATACACATAC 1440

TACTATGATG CTCTATGCTC AGGTTAGCCC GAAAGATGTT CAGTATAGAT TAGGCCACTC 1500

TAATTTAATG ATCACTGAAA ATACTTACTG GCATACTAAC CAAGAGAATG CAAAAAAAGC 1560

CGTCTCAAAT TATGAAACAG CTATCAACAA TTTATAAAAA ATAAGGGTGA CCCATTTCCG 1620

GGCTACCCTC TTACTATACC AAAAATTAGT AGGGGTAGTA AAAAGGGTAT TAAATTATAA 1680

AAAGCACTAA GGGAAAGCGC CCCAAAGTGC TTATTTCAAA GGCTTTATAG CCTATAATCA 1740

CATAAAGAGA TTATTTTTTA AGGTTGTAGA ATGATTTCAA TCCACGATAT TCAGCTACTT 1800

CACCAAGTTG GTCTTCGATA CGAAGCAATT GGTTGTATTT AGCGATGCGG TCTGTACGTG 1860

AAAGTGAACC AGTCTTGATT TGTCCTGCGT TAGTTGCAAC TGCAATATCA CCGATTGTTG 1920

AATCTTCAGT TTCACCTGAA CGGTGTGATA CAACAGCAGT GTAACCAGCT TCTTTAGCCA 1980 

TTTCGATAGC TTCAAAAGTT TCAGTAAGAG TACCGATTTG GTTAACTTTG ATAAGGATTG 2040 

AGTTAGCAGC ACCTTCTTGG ATACCACGTG CAAGGTAGTC AGTGTTTGTT ACGAAGAAGT 2100

CGTCACCAAC AAGTTGTACT TTCTTACCAA GACGTTCAGT AAGAGCTTTC CAACCATCCC 2160 

AGTCGTTTTC ATCCATACCA TCTTCAATAG TGATGATTGG GTATTTGTTA ACCAATTCTT 2220 

CAAGGTAGTC GATTTGTTCT GCAGATGTAC GAACAGCAGC ACCTTCACCT TCAAATTTAG 2280

TGTAGTCGTA AACTTTACGT TCTTTATCGT AGAATTCTGA TGAAGCACAG TCAAATCCGA 2340 

TAAATACGTC TTTACCTGGT ACATATCCAG CAGCTTCAAT CGCAGCAAGG ATAGTTTCAA 2400 

CACCATCTTC AGTTCCTTCG AAACGAGGAG CGAATCCACC TTCGTCACCT ACGGCAGTTT 2460

CCAAACCACG TGATTTAAGG ATTTTCTTAA GAGCGTGGAA GATTTCAGCA CCGTAACGAA 2520 

GGGCTTCTTT AAATGTTGGC GCACCAACTG GCAAGATCAT GAACTCTTGG AAAGCGATTG 2580 

GAGCGTCAGA GTGAGAACCA CCGTTGATGA TGTTCATCAT TGGAGTTGGA AGAACTTTAG 2640 

TGTTGAATCC ACCAAGATAG CTGTAAAGTG GGATTTCAAG GTAGTCAGCA GCACCACGAG 2700 

CTACAGCGAT AGACACACCG AGGATTGCAT TCGCACCCAA TTTACCTTTG TTAGGAGTAC 2760 

CGTCAAGTGC GATCATAGCA CGGTCAATAG CTTGTTGATC ACGTACATCG TAGCCAATGA 2820 

TAGCTTCAGC AATGATGTTG TTTACGTTGT CAACAGCTTT TTGTGTACCA AGACCACCGT 2880 

AACGAGATTT GTCACCGTCG CGAAGTTCAA CTGCTTCGTG TTCACCAGTA GAAGCTCCTG 2940 

ATGGAACCAT ACCACGTCCG AAAGCACCTG ATTCAGTGTA AACTTCTACT TCAAGTGTTG 3000

GGTTACCGCG TGAGTCTAGG ACTTCGCGAG CGTAAACATC AGTAATAATT GACATTTTTP 3060

ACTCTCCTTA TGAGTTAAAT TTTTTACACC TCTATAATAC CTTAAAACCC CTCCTTTTTC 3120

AAGAAAAAAC GTTATCTTTG TGCAACTTTT CCTTAACTTT ATAAAGTAAT CGCTTTCTTT 3180

TGTCTGTTTT ATTCTAACTT TTATGATATA CTGTTTTCAT GACAGATTTA TCAAAACAAT 3240 

TACTTGAAAA AGCTCATGGT GGGTTAAAAA TAAATCCGGA TGACCAAAGA CGCTATCTTG 3300 

GTACTTTTGA GGAAAGAGTT CTTGGATATG TAGATATTGA CACAGCAAAT AGCCCTCAGT 3360 

TAGAAAAAGG CTTTTTATTT ATTTTAGAAA ACCTTCAGGA AAAAGCAGAG CCACTATTTG 3420 

TGAAGATTTC ACCAACTATC GAATTTGATA AGCAAGTTTT CTACTTAAAA GAAGCAAAAG 3480 

AAACTGATAG TCAAGCCACC ATAGTATCTG AAGAGCATAT TACTTCTCCT TTTGGCCTGG 3540 

TTATTCATAG CAATGCACCA GTTCAAGTAG AAGAAAAAGA CCTTCGACTT GCTTTTCCAA 3600 

AACTTTGGGA AGTTAAAAAG GAAGAACCAG CCAAAACATC CTTATGGAAG AAATGGTTTA 3660 

GCTAAATCTT GCACATATTT AATAAGTGCC CAATATTGGC AGCCGTGCGC TCCAGATAGA 3720

AACTGGCATT TTTCAAACTA TCTTCTAAAG GTTCACTTTT CTCCAAAATA GAAAAGACAG 3780

CTTGGATATT TTCAAATGGT AGGGGAGGTA AATCTTCAGC AAGACTACCG CAAATAGCAA 3840

TAACAGGAAC TCCAACAGGG GTTCTTTTTG CAACACCTAT AGGCGCTTTC CCAGCAAAGC 3900

TTTGACTATC AAGTCTTCCT TCTCCAACAA CAACCAAGTC AGCATCTGAA ACTTTCTTAT 3960

CAAAGTTCAT TAAGTCCAAG CAGGTATCAA TTCCAGACAC GATACTTGCC TGAGCAAAGG 4020

CACACAAACC ACCAGCAAGG CCTCCACCTG CTCCTGCTCC TTTAATTTCT AATGTTGCAC 4080

GTGAGAATTT TTCATAAAAA TCTTGGATCG CCTGATCTAC GACTGCAAAC ATAGTCGGAT 4140

GTAGACCTTT TTGATTGCCA AAAGTGTAAG TCGCACCTTG ATGACCACAT AAGGGACTCA 4200

CGACATCTGC TAAAATATGA ATTTGAACAC CTTCAGGAAT TTTATAGCAA TTTTCTGTTG 4260

AAACAGAAGC TAAGTTTAAT AAGGATTGAC CGGAAGCAGG CAAGACATTT CCATCCCTAT 4320

CATAAAATTG ATAACCTAAA CCAGCAGCAA TCCCCAGTCC TCCATCATTA CTGGCCGTGC 4380

CACCAACACC GATATAAATA TCTTTAATCC CTTTAGAGAT GAGATGAAGA ATCAACTCTC 4440

CAATACCACA AGTTTGGATT TCAAGTGGAT TTCGTTTCTC TAGCGGAATT TTTCCAACAC 4500

CAACCAAGTC AGCTACTTCA AATAGTGCCA GTTCCCCTTT TTGAAAATAG CGCATGGCTT 4560

CTTTTTGTCC AAAAGGGTCT GTCACTTGGA TCCATTTTTC TTTTAGGTCA AGAGAATGTC 4620

GGATAGCATC TACAGTACCT TCTCCCCCAT CACCAACAGG GCAGAGGAGA CATTCTACAT 4680

CTGCTATCGA TTGTTGGAAG CCTCTTTTTA TTGCTTCAGC TACCTGTTGA GCTGTCAACC 4740

TTTCCTTAAA CGAATCCCGT GCAATTACAA TCTTCATATT TTCCCTCATT CTAAACAGTC 4800

AATCAAAGGG AGAACTTCTA AAAAATCCCT CTTGTCAACA TGATGTGGTA TTTCTTTTTT 4860

GAGCACTTCT TTGGCACAAA AGGCGATTCC TAACTTCGCC GACTTCAACA TTAATAGATT 4920

ATTAACCCCA TCACCGATTG CCACCGTTCT TTCTTTAGAA AGTTTTAGTT TCTTTCTCCA 4980

TTTTTCCAGA GTCTCTTTTT TGACCTGGGG ACTTATAATT TGTCCAACTA ATTTTCCTGT 5040

TAAAAGACCT TCTTTGACTT CAAGCTAGTT GGCAGTGAAA TAGGCAATAC CAAGGGATTT 5100

TGCTAATCTC TCCAACTATT GGTGTAAATC CACCAGACAC CAGACCAACT AGGATGCCAT 5160

TCTTTTGGAG AATAGAGATG AACTCTGGGA CATTTAGCGA TAGATGAATT GAGTTGAACA 5220

CGTTATCAAA GACCAAAATA GGAAGACCTT CCAACAAGGA CACTCTTTTT CTTAAACTGC 5280

TTTCAAAGAC CAACTCTCCT CGCATTGCTC GACTTGTAAT CTGCGAAATT TCCGCCTCAT 5340

GACCTGCCTC TCTCCCTAAA AGATCAATCA CTTCTTCTAG GATTAAGGTT CCATCTACAT 5400

CCAAAACACA CAAGCCTTTT ACTTGAGACA TCAGTTCTCC TCTCTAAACA GCCTAAAAAT 5460

CGTATGAAGT CATCATACGA TTTTATCTAT TAATTAACTA AACTATGGTA CAAGTCAAGG 5520

TATGACTTGC AGGCTGTATC CCATGAGAAG TCACTCTCCA TAGCTTGTTT TTGTAGGTTT 5580

CTCCAAATGT CTGGATGGTT TCTATACAAG TCCAATGCTG TTTGGAAAGT CCAATTTAAC 5640 

CAATAAGGAG ATAGATTGTC AAAGCTAAAG CCAGTACCGC TTCCTTCGAT TGGATTGAAA 5700 

GCGCGAACTG TATCTCGCAA GCCTCCAACT TCATGGACCA ATGGCAAGGT TCCATAACGC 5760 

ATAGCCATCA TTTGAGACAA GCCACACGGT TCAAAACGAC TTGGCATGAG GAAGAGGTCA 5820 

CAAGCAGCGT AGATTTCCTG AGCAAGTTTG ACATCAAAAG TGATATTTGT TGATAGCTTG 5880 

TCTGGGTAAA TCTGAGCAAA CCATGAGAAA GCTCCTTCAA AGGCTGGATC GCCAGTTCCC 5940 

AAAAGAACAA TCTGAACATC TTCTTGCAAG ATATGGTGAA GACTTTCGAC CACCACATCA 6000 

AAACCTTTTT GACGTGTCAA ACGAGAAACA ATTCCCACCA GTGGAACGTC TGCTCTAACA 6060 

GGCAAGCCAA CTCTTTCTTG CAATTTTGCC TTATTTTTGG CTTTCCCAGA CAAATCTTCC 6120 

TGATTGAAAT GATAGTCTAA AAGAGCATCC GTCTGAGGAT TATAAAGATC AGCATCAATC 6180 

CCATTCACGA TACCAGATAC TTTACCAGAC TCCATTTTAA GAATCTGATC CAAATTACAT 6240 

CCAAACTGAC TAGTCATAAT TTCATGAGCA TAGCTAGGTG AAACGGTTGA AACACGGTTC 6300 

GCATAGAGAA TACCTGCCTT CATCCAGTTC AGACACTTGT TCCATCGAAG GGTGCCATCA 6360 

GCGTAACGTT CAAAGCCAAC TCCAAACAAA TCACCCAACA TTCCTTCTGA AAATTGTCCT 6420

TGGAATTCTA AATTATGAAT GGTTAAAACT GTTTCAATGT CCTCATAGGC TTGAATCCAA 6480 

CGGTATTTTT CCTTCAACAA GAAAGGAATC ATAGCTGTAT GGTAGTCATG AACATGGAGA 6540 

AGATCAGGAA TAAAGTCAAT CCTTTCCATA CCCTCAATGG CAGCCAGTTG GAAAAAGGCA 6600 

AAGCGTTCTC CGTCATCAAA ATCACCGTAA ACATGACCAC GGAAGAAATA ATATTGATTG 6660

TCAATAAAGT AGAAGGTTAC ACCATTTAAT ACTGTTTTCT TAATTCCACA ATACTGTCTG 6720 

CGCCAACCAA CGCTCACCTC AAAATGAAGC ACATCTTCAA TCTGATTTCC AAATTTAGCC 6780 

TCTACCATAT CATAGTAGGG TAAAATCACT GCAACTTCGT GCCCAGCTTT TACCAGTGAT 6840

TTTGGAAGAG CGCCAATGAC GTCTCCCAAA CCACCTGTTT TTGAAAAGGG TGCACCCTCT 6900 

GCTGCTACAA ATAAAATTTT CATGAATGAA TATCCTCTGT TACTTTAGCA CCTTTCTTAA 6960 

CCACAACTGG ATGTTCTGCA GTTCCTCGAA TCACAACACC ATGCTCAACT TCAACCCCTT 7020 

TGTCCAAGAT AGCATATTCG ACCTGAGCCC CTTCTCCAAT AACAACACGA GGGAAGAGCA 7080 

GGCTATCTTT AACCAAGCTA TCCTTATGGA CATGAATATT ACGTGATAGA ACAGAATTAG 7140 

CTACTTGACC TTCAATAATA CTACCAGAGG CAAACTGAGA AGTGCTTACC TTAGATGTAT 7200 

TAGCATAGTA AGTTGGCTCT TCGTTTTTGA CCTTTGTATA AATCTTTTGG TTTGGTGAGA 7260

AAAGAGAATA GAATTTTTGT GATTCAAGCA TATCGATATT CGCTTGATAA TAAGATTTAA 7320

CAGAGTGAAT ATTGGCTAGA TAGCCCGTGT ACTCGTAGGC GAAAGCTCCC TCTTTTACAG 7380

CCAAATCCCG TAAAACATAG CGCAATTTCT CTGGATGTTC TTTTTTAGCT TCTTCTTCCA 7440 

AGTGTTCAAT CAACCAAGGT GTATCAACGA CAAAGATATC TGTAGACATA TTGAACGTTT 7500 

CAGCTGTTGA CTTGCTATCA AAGAGTTTAT GAGAAAGAAC ATGGTCTGTT TCATCTACAT 7560 

CCAAGATTGC ATTTACTTCT GAAATATCTT TCTTAGCTAG TTTTTTATAA ACTACAGTGA 7620 

TAGGCTCTTT TGTTGTACTA TGTAGGTGGA AAACTTGGTT CAAATCAATG TTAATAAGAA 7680

CATCGCAGTT GAGGGCAACC GTTTGGTTTG AGCCAGAACG TTTCAAATAA GTAAGAAGCT 7740 

GTTGGTAGTA TTCTTTTCCA ACTGTACTAC TTTCTACACG GGTATTGTAA ATTCCTAGAT 7800 

AGTAATGGCT AAGAAGGGTT GATAAGCCCC ACTCGCGTCC TGAACGAATA TCGTCAAATA 7860

CTGAGCTGAT ATTATCCTGC TGGAAAATAC CAAAGACACT ACGAACACCT GCATTAGCAA 7920

GGCTTGAAAG TGGGAAGTCA ATCAAACGAT ATTTCCCACC AAATGGCAAA CTTGCTACTG 7980 

GACGGTGGTC CGTCAATGTC GACATATTGT GAAAACCAAC TGTATTTCCT AAAATCGCAG 8040 

AATATTTATC AATCTTCATC TGTTGCTACC CCCACTACTT CATTATATCC TACAACTTGT 8100 

ACTTCATCTG TTCCATCAAT TTCGACACCG TCAGAAATAA TCGCACCTTC ACCAATAATG 8160 

GCACGTTTAA TCTTAGCTCC TTGACCAATG ATAGCTCCAC TCATGATAAC TGAATCAAGG 8220 

ACTTCCGCTC CTTCGCGAAC TTGCGCGCCT GTTGAAAGGA TAGAATCTTT AACAGTTCCA 8280 

TCAACGAAAC ATCCGTCTAC AACTAATGAG TCTTCCACAT GAGCATTTGC CCCGAGGAAG 8340 

TTTGGTGGTG AAATCAAGTT TCTTGAGTAA ATCT7CCATT GACGGTTACG ACTATCCAAG 8400 

GCATTTTCTG GAGAAATATA CTCCATGTTC GCTTCCCAAA GTGACTCAAT AGTACCAACA 8460

TCTTTCCAAT AACCACTAAA TTCGTAAGCA TAAACACTTT CACCTGACTC AAGGTAATTT 8520 

GGAATGACAT TTTTACCAAA GTCTGACATG CCAACCTTGC TCTTTTCAGC AGCGACTAAC 8580 

ATATTACGAA GGCGTTGCCA ATCAAAAATG TAGATTCCCA TAGAAGCTTT TGTAGATTTA 8640 

GGTTGAGCTG GTTTTTCTTC AAATTCAACA ATACGATTGT TAGCATCTGT GTTCATGATA 8700 

CCAAAACGGC TTGCTTCTTT AAGAGGGACG TCTAAAACTG CTACTGTCAA GCTGGCATTA 8760 

TTATCCTTAT GAGACTGGAG CATATCATCA TAGTCCATTT TGTAGATGTC ATCCCCAGAC 8820

AAAATCAAGA CATACTCAGG ATTGACACTG TCGATATAGT CGATATTTTG GTAAATAGCG 8880 

TGACTAGTCC CCTCAAACCA ACGATTTCCT TCACTTGCAG AATAAGGTTG AAGAATAGAG 8940 

ACACCTGAAT TAATACCGTC TAGTCCCCAG CTTGAACCAT TCCCAATATG GTTGTTGAGA 9000

GCAAGTGGTT GATACTGTGT AACGACCCCA ACATTGTGAA TCCCTGAGTT GGCACAGTTT 9060

GATAGGGCAA AGTCAATGAT ACGGTAGCGC CCACCAAATT GCACAGCTGG TTTTGCGATG 9120

CTTTGAGTGA GTTTACCGAG ACGAGTTCCT TGCCCACCAG CAAGAATCAA AGCTAACATT 9180

TCATTTTTCA TTTTCTACTC CTTTTTGGTT TTTATTTGTG ACGGTTTTAG TAGATTTCAA 9240

GCGACGTTTG ATTTTCCATA CACTTGCTCC CATAGCCGGT AGGGTAAAGG TTAAGGTCTG 9300

CTCATAATCT TTCCATAGTC CTTCTTGCGT TTGAACAGTT TGATTATGTT CTTTCCAAAC 9360

GCCTCCCCAC TCTTCCAACT CAGTATTCCA TACTTCTTCG TAAATTCCTG CAACGGGTAG 9420

TCCGATTGTA AAATCTTTCC GCTCAACAGG TACCATATTA AAGATACAGA CTAACATTTC 9480


aw^i^a 4 4 4 in 


CCCTTACGAA 


TAAAGGAAAG 


AACACTCTGG 


TCTCGATTAT 


CCGCATCAAT 


9540 


GATTTCAATA CCATCATAGC TGGTATCAAT TTCCCACAGA CAGCGATGAT CTTTGTAAAA 9600

CTGGTTTAGC TGAGAAGCGA AATACTTCAT CTTAGCATTC ATTGGGTCTT CTAGGTTAGA 9660

CCATTCCAAC TGTTCTTCAG ATTTCCATTC TAGGAATTGA CCGTATTCGC TACCCATGAA 9720




TTAPCACGCT 
i a *a v» \j a 


GACAAATTTG 


GTACGTATAG 


AGATTGCGCA 


AGCCTGCGAA 


9780 




CGATCTCCCC 


ACATCTTATG 

4b*v~ •» A «w m A *• A W 


CATCATACTC 


TTCTTGCCAT 


GAACCACTTC 


9840 




AATnnCAAHA 


GATAATTCTC 

w« a nn a » a 


CTTGAAAACA 


TACATAAAGC 


TGAAAGTCAC 


9900 


CAGnTTAAAG 


TCATATTTAC 

A ^™ *» A ■» AAA 


GATAGATCGG 


ATCTTCTTCG 


TAGAAACGGA 


GGATATCATT 


9960 




ATGTTCGATT 

-TV A W A A \»A A A 


TGTAGTCAAA 


TCCTAGACCA 


CCAATCTCTT 


TCATTCCCGT 


10020 


AATCTTGATC GCAGACGAAC TTTCTTCTGC AATCATCATC ACATCTGGAT ATTCTAACTT 10080

AATAACCTCA TTCAAGCGCT GAAGGAAATA ATAACCTTCA TAGTTGAGAT TTCCGCCATC 10140

TTTATTAGGT GTCCATGGAG CATCATCATA GTCCAAATAG AGCATGTTGC TAACAGCATC 10200

CACACGAATA CCATCCAAAT GATAGACATC AATCCAATGC TTAATGCAAG AAATTAAGAA 10260

GGACTGGACT TCATTTTTTC CAAGGTCAAA ATTAAGGGCA CCCCAACCAT GGTTATGAGC 10320

CTTATTATGG TCTTGGTATT CAAAAGTCGG TGTCCCATCA TAATAGGCTA AGGCATCATC 10380

GTTGATGGTA AAGTGACTGG TACCCAGTCC ACAATAACCC CAATATTATG GGTATGACAC 10440

TCCTCGACAA AATCTTGAAA CTCCTCTGGT CGGCCATAAG CATGCTCTAA AGCGAAGTAA 10500

CCCATAAGCT GATACCCCCA ACTCAAGCCC AAAGGATGGG ACATCAAGGG CATAAACTCA 10560

ATATGAGTAT AGTTCATTTC AACGAGATAA CGAATGAGTT CATCCTTGAG CTGGGCAAAA 10620

CTATAAGGAC TGCCATCAGA ATTTCTTTTC CATGATCCAG CGTGAACTTC ATAAATATTG 10680

ACAGGACGCT CTTCAAAGCC CCAACGTTTT CTTCGTGCCA GCCAAAGTCC ATCCTTCCAT 10740

TTCTTCTCAG GAAGCTCTGT TACGATTGCC CCTGTTCCTG GACGAGCCTC ATACCTGACA 10800

GCAAAAGGGT CAATCTTCAT CAGTTGATGA CCATTTTGAC GTGTGACATG ATATTTGTAA 10860 

ATATGCCCTT CTTGAGCCAT ATTGGTAAAG ACTTCCCAGA CCCCAAAATC ATTTCTTACC 10920 

ATTGGAATCT GATTTTCAAT CCAGTTGGTA AAATCACCAA CCAAGTGAAC AGCCTGAGCA 10980 

TTAGGTGCCC AAACACGGAA GGTATAGCCA TGCTCTCCAT TTAGTTCTTC CCTATGTGCT 11040 

CCTAGATAAT GTTGGAGATA AAAATTTTCA CCCGTCATAA AGGTTTTTAA TGCTTCTCTA 11100 

TTATCCATAT ACTCCCCTTC TCCTGTAAGC GTTTTCTATG TTTTTATTAT ACTACCTTTT 11160 

TAGAGAAGAT TCAAGTAAAT TACTATACTT CTTTAATTAT TTTGAAAATC TACAACAAGT 11220 

TCACTTACTC GTTCAATTGT AAATCAATAT TTTTTCAAAA AATTGCGAAA ACGCCTTTCT 11280 

TTTTCTACTA TAGTGAAATG AAATAAAACA TGCGCAAATC GATTAAGGAA TTTAATCTAA 11340 

TTTCTAACAA TGTCTTAGAA ATCAAAGTGT ACTATTTTAA CTCC 11384 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7577 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

TGTTGATTTG TTACTAGACG TTGACCAACG TCCTTCCGCT GGAAAAGGAA TTCTCCTTAG 60 

TTTCCAACAC GTTTTCGCCA TGTTTGGTGC GACCATCTTC GTACCATTCA TTTTGGGAAT 120 

GCCTGTATCT GTTGCCCTTT TTGCTTCAGG TGTTGGAACA CTCATCTACA TGATTGCTAC 180 

TGGTTTTAAA GTTCCAGTTT ATCTAGGTTC TTCATTTGCC TTTATCACAG CTATGTCACT 240 

GGCTATGAAA GAAATGGGGG GGGATGTATC TGCTGCCCAA ACAGGGGTTA TCTTGACTGG 300 

TTTGGTCTAT GTCCTTGTTG CTACCAGCAT CCGATTTGTA GGAACAAAAT GGATTGATAA 360

ACTCTTGCCA CCAATCATTA TCGGTCCTAT GATCATCGTT ATCGGTCTTG GACTTGCAGG 420 

TTCAGCTGTT ACCAATGCAG GTCTTGTAGC AGACGGAAAT TGGAAAAATG CTCTGGTAGC 480 

CGTTGTTACT TTCCTAATTG CTGCCTTTAT CAATACAAAA GGAAAAGGCT TCCTACGAAT 540 

CATTCCATTC CTCTTTGCCA TTATCGGTGG TTACCTTTTC GCACTAACTC TTGGCTTGGT 600 

TGACTTTACA CCAGTTCTTA AAGCCAACTG CTTCGAAATT CCTGGTTTCT ACTTGCCATT 660 

TAGCACAGGT GGTGCCTTTA AAGAGTACAA TCTTTACTTT GGTCCAGAAG CCATCGCTAT 720 

CTTGCCAATC GCTATCGTAA CAATTTCTGA ACATATCGGA GACCATACTG TTTTGGGTCA 780

AATCTGTGGT CGTCAATTCT TAAAAGAACC AGGTCTTCAC CGTACTCTTC TTGGTGACGG 840 

TATCGCAACT TCTGTTTCTG CCTTCCTTGG TGGACCAGCC AATACAACTT ACGGAGAAAA 900 

TACAGGGGTT ATCGGTATGA CTCGTATCGC TTCTGTCTCA GTTATCCGTA ACGCTGCCTT 960 

CATCGCGATT GCCCTCAGCT TCCTTGGTAA ATTCACTGCC TTGATTTCAA CTATTCCAAA 1020 

CGCTGTACTT GGTGGTATGT CAATCCTTCT CTATGGGGTT ATCGCCAGCA ATGGTTTGAA 1080 

AGTCTTGATT AAAGAACGTG TTGATTTCGC TCAAATGCGA AACCTCATCA TCGCAAGTGC 1140 

TATGTTGGTT CTTGGACTTG GAGGAGCTAT CCTTAAACTT GGTCCAGT7A CACTTTCAGG 1200 

TACTGCCCTT TCAGCCATGA CAGGAATCAT CTTGAACTTG ATCTTGCCAT ACGAAAATAA 1260

AGACTAAGAG TCTAAATACA CCTAATCCAC TCAGACAGCT GAGTGGATTT TTCGTATACC 1320 

ATAATAAAAG TGTCTTAACA AAATTATTAA AATCAAAAAA CGTATAATAT CAGATATTCT 1380 

AAAACCTTGA TACTGTACGT TTTATCATAG AAATTTTTAC TTTATTTTCT CATCAAATGA 1440 

GATTTGCATC AATCTCTTGT CTTACTTGCG TTTCTTCTTC GCTTTCTTCA TTTTGTTAGC 1500 

CATACGTTTC ATGGACTGTT TCATGGCAAA TTCACCAATT TTACCTTTCA AACCGCCACC 1560 

AAACATCTGG CTCATATCTG GCATTCCTGC TCCTCCGAGA GCTGATAAGT CAGGCATACC 1620 

GCCTTGTCCC ATCATTCCTT CAAGGGCAGA CATATCCATT CCTCCCATAT TTGGCATATT 1680 

TTTAGGAAGG TTATTTGGAT TAATCCCCAT TTGCTTCATC ATTTTATTCA TATCCCCAGA 1740 

CATAACACCC TGCATGAGCT GTTTAGCCTG GTTAAAGTCC TTGATGAATT TATTGACTTC 1800 

GACGAATGTA TTTCCAGAAC CAGCAGCAAT ACGACGGCGA CGGCTTGGAT TTAACAAATC 1860

TGGGTTTTCA CGCTCTTCAG GTGTCATCGA AGACACAATG GCACGTTTAC GAGCAATCTG 1920 

GCGTTCATCC ACCTTCATGT TTTGAAGGGC TGGATTGTTG GCCATACCTG GAATCATCTT 1980 

GAGCAAGTCT TCCATCGGCC CCATATTTTG CACCTGATCT AATTGATCGA TGAAATCATT 2040 

AAAATCAAAG GTGTTTTCGC GCATCTTCTC AGCCATTTCA AGGGCTTTTT GTTCATCCTA 2100 

TTCCTGAGAA GCTTTCTCAA TCAAAGTGAG CATATCCCCC ATACCAAGCA TACGGCTAGA 2160 

CATGCGGTCT GGGTGGAAGG TTTCAATGTC CGTAATCTTT TCACCTGTAC CAGTGAACTT 2220 

GATTGGTTTT CCAGTAATGT GACGAACAGA CAGAGCAGCA CCACCACGAG TATCGCCATC 2280 

AATCTTGGTA AGGATGACCC CAGTCACTTC CAACTGAGCA TTAAACTCAC GCGCAACATT 2340

GGCTGCTTCC TGACCAATCA TAGCATCAAC GACAAGCAAG ATTTCATTTG GTTGAGCCAA 2400 

TGCTTTCACA TCACGAAGCT CATTCATGAG GAGCTCATCA ATCTGCAAAC GACCCGCAGT 2460 

ATCAATCAAG ACATAGTCGT TATGATTAGT TTGGGCTTGC TCCAAACCTT GACGTACAAT 2520 

CTCAACAGCT GGTACTTCTG TTCCAAGTGC AAAGACAGGC ACATCAATCT GTTGTCCCAA 2580

GGTCTTAAGC TGGTCAATGG CAGCTGGACG ATAAATATCC GCCGCAATCA TCAAAGGACG 2640

AGCATTTTCT TCTTTCTTGA GTTTGTTGGC CAATTTACCA GCAAAGGTTG TTTTACCAGC 2700

CCCTTGTAAA CCAACCATCA TGATGATGGT TGGAATCTTA GGTGACTTGA TAATTTCTGC 2760

CGTATCAGAA CCTAAAACGG CTGTCAATTC CTCATCAACG ATTTTAATAA TCTGTTGCGC 2820

AGGATTAAGT GTATCAATGA CCTCATGCCC GACTGCACGC TCACGAACTT TCTTGATAAA 2880

GTCCTTTACA ACAGGCAAGG CAACGTCGGC CTCGAGCAAG GCCAAGCGAA TTTCTTTGGT 2940

TGCCTCTTGG ACATCAGATT CAGAGATTTT TCCTTTTTTA CGTAGATTTT TAAAGACGTT 3000

CTGCAAACGT TCTGTTAAAC TTTCAAATGC CATTTTTCTT CCTCTTATTC TCTATTATCA 3060

ATGCTTGTTA AAATTTCTAT CTGCTCCTGC AGAAAGTCAT CCTTGGGATA GCGCTCCAAA 3120

ATCTGATCAA AAATCTGACT GCGGACAATA TAGTCCGAGT ACATGTGCAA TTTCATCTCA 3180

TAATCTTCCA GAATCTTTTC TGTTCGCTTG ATATTCTCAT AGACAGCCTC ACGACTGACA 3240

CCGAACTCCT CGGCAATTTC AGCAAGGCTG TAATCATCAG CGTAGTAGAG CTCGATATAA 3300

TTCATTTGCT TATCTGTCAA AAGCGCCGCA TAAAATTCAA AGAGCGCATT CATACGATTG 3360

GTTTTrrCGA TTTCCATAAC TTTTATTATA CCAAAAATTA GCCTAATCTA CCACACTAGG 3420

AAGCCGATCC AAGAAGATAG ATAGCTAAAT TTGAAAAAGA CATGAGCCTA GCCCCAAGTA 3480

ATTTCCAATT GATAGCTGGC AAACGGATGT CCCTCTTGAT TTTGTAGTTG ATAATCTAGT 3540

TCAATCTTTT GCCTATCAAC TTGATAATCG CTCGTTTGGA TGATAAACTC CTGCATGCCC 3600

ATAGGTGTAG GAATATAGGC TAAACTATCG CTATCCTTTA GAAAGCGCAT AATGGTCTTG 3660

GGATTAGAAA ATCGGCTCAT CACAAGTTCT TCACCATGAA ATTTAATCAC TACTTTTTCC 3720

TTTTCCTCAT TATAGAAAAG CAGGTAGCTA TAATCTCCTT TTTCATGCAC TTCCACATCA 3780


TAAAGCTGGT 


CAATCACTTC 


CAACTGCTCA 


TCAAACTGAA 


TCGTATTTCG 


CATCCGAATC 


3840 


TTCACATCAG 


GCCCTCTTTC 


TTGTCTCTTG 


TCCTACTATT 


TTACCAAAAA 


GAGCAGGATT 


3900 


TTGCTATAAT 


GGTCATATGA 


ACGAAAAAGT 


ATTCCGTGAC 


CCTGTTCACA 


ACTACATCCA 


3960 


TGTCAATAAT 


CAAATCATCT 


ATGACTTGAT 


TAATACAAAA 


GAATTTCAGC 


GTTTGCGCCG 


4020 


GATCAAACAA 


CTGGGAACTT 


CCAGTTATAC 


CTTCCACGGT 


GGAGAACACA 


GTCGCTTCTC 


4080 


TCACTGTCTA 


GGAGTCTATG 


AAATTGCACG 


ACGCATCACA 


GAGATTTTCG 


AAGAAAAATA 


4140 


TCCTGAGGAA 


TGGAATCCTG 


CCGAGTCTCT 


CTTGACCATG 


ACCGCTGCTC 


TCCTACACGA 


4200 


CCTTGGGCAT 


GGTGCCTACT 


CCCATACTTT 


TGAACATCTC 


TTTGATACAG 


ACCATGAAGC 


4260 
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CATTACTCAG GAGATTATTC AAAATCCTGA GACAGAGATT CACCAAGTCC TGCTACAAGT 4320 

GGCACCTGAT TTCCCAGAAA AGGTGGCCAG TGTCATTGAC CATACCTATC CTAATAAGCA 4380 

GGTCGTGCAG CTCATTTCTA GTCAGATTGA CGCAGATCGC ATGGACTATC TCTTGCGCGA 4440 

CTCCTATTTT ACAGGAGCAT CCTATGGGGA ATTTGACCTG ACTCGAATCC TCCGAGTCAT 4500 

TCGTCCTATC GAAAATGGTA TCGCCTTTCA GCGCAATGGC ATGCACGCCA TCGAAGACTA 4560

CGTCCTCAGT CGCTACCAGA TGTACATCCA GGTTTATTTC CACCCCGCAA CACGCGCCAT 4620

GGAAGTTCTC CTACAGAATC TTCTCAAACG CGCCAAGGAA CTCTATCCTG AGGACAAGGA 4680

TTTCTTTGCC CGAACTTCTC CACACCTCCT GCCTTTCTTC GAAAAAAATG TGACCTTGAC 4740

TGACTATCTG GCTCTGGATG ATGGCGTGAT GAATACCTAC TTCCAGCTTT GGATGACCAG 4800

TCCTGACAAG ATTCTTGCAG ATTTATCGCA TCGCTTTGTC AACCGCAAGG TCTTTAAATC 4860 

CATTACCTTT TCACAAGAGG ACCAAGATCA ACTTACTAGC ATGAGAAAAT TGGTTGAGGA 4920 

TATCGGCTTT GATCCCGACT ACTACACTGC CATTCATAAG AACTTTGACC TCCCTTATGA 4980 

TATCTATCGT CCCGAATCTG AAAACCCACG GACAGAGATT GAGATTTTAC AAAAAAATGG 5040 

AGAACTGGCC GAACTCTCTA GCCTGTCTCC TATCGTCCAA TCCCTTGCTG GCAGTCGCCA 5100 

CGGAGATAAT CGCTTTTATT TTCCAAAAGA AATGTTGGAC CAAAACAGCA TCTTTGCAAG 5160 

CATTACCCAG CAATTTTTAC ACTTGATTGA CAACGATCAT TTTACCCCAA ATAAAAACTA 5220 

GAAGAGGAAA TTTATGAGTA TTAAACTAAT TGCCGTTGAT ATCGACGGAA CCCTTGTCAA 5280 

CAGCCAAAAG GAAATCACTC CTGAAGTTTT TTCTGCCATC CAAGATGCCA AAGAAGCTGG 5340 

TGTCAAAGTC GTGATTGCAA CTGGCCGCCC TATCGCAGGC GTTGCCAAAC TTCTAGACGA 5400

CTTGCAGTTG AGAGACGAGG GGGACTATGT GCTAACCTTC AACGGTGCCC TTGTCCAAGA 5460

AACTGCTACA GGACATGAGA TTATCAGCGA ATCCTTGACT TATGAGGATT ATCTAGATAT 5520 

GGAATTCCTC AGTCGCAAGC TCGGTGTCCA CATGCATGCC ATTACCAAGG ACGGTATCTA 5580 

TACTGCAAAT CGCAATATCG GAAAATACAC TGTACACGAA TCAACCCTCG TCAGCATCCC 5640 

TATCTTCTAC CGTACCCCTG AAGAAATGGC TGGCAAAGAA ATTGTTAAAT GTATGTTTAT 5700 

CGATGAACCA GAAATTCTCG ATGCTGCGAT TGAAAAAATT CCAGCAGAAT TTTACGAGCG 5760 

CTACTCCATC AACAAATCTG CTCCTTTCTA CCTCGAACTC CTTAAAAAGA ATGTAGACAA 5820 

GGGTTCAGCC ATTACTCACT TGGCTGAAAA ACTCGGATTG ACCAAAGATG AAACCATGGC 5880 

AATCGGTGAT GAAGAAAATG ACCGTGCCAT GCTGGAAGTC GTTGGAAACC CCGTTGTCAT 5940 

GCAAAATGGA AATCCAGAAA TCAAAAAAAT CGCCAAATAC ATCACCAAAA CAAATGACGA 6000 

ATCCGGCGTT GCCCATGCCA TCCGAACATG GGTACTGTAA AAGTATCATT TTTCAATAAG 6060

AATTGATTAG CAATAAAATC CAATGAATTT TTTTAGCAAA CTATTTAATT TAAAACAAAA 6120

TAATCATAAT AGAGACACAA ATTCTGATTG TAACAATTTT TACCTAAACG AATTAGAATG 6180 

TGGCCTTACT CCTGGGCAAC TCATACTCAT AGATTGGACT CAAAAAACAG GGAGAAATTA 6240 

TAATTTCCCA AGATATTTTA AATACTCTCT TCAAATTGAC CCTGAATCTA CACACAATCA 6300 

ATTATACAAA TTAGGATACT TCACTAAAAA TAAGACTTTA TCATATCTTA CAGTAGTAGA 6360 

ATTAAAAACT ATATTATCTA AACATAATTT AGCTACTTCT GGAAAAAAAG CAGAATTAAT 6420 

TACAAGAATA ATTAATAATG TTAACATTGA CAATTTAGAT ATTCCGTTCG AATTTAAACT 6480 

AACAAAAGAA GCACAAAATC TTATTATCGA ACATAGTGAC TATATCAAAG CATACTATGA 6540 

TAAAGACATA ACTATGGAAG ATTATTGTAA AGAAAAAAAC AATATCTCTT TTAAAGCAAC 6600 

TTTTGGTGAT ATAAAATGGA GTCTCTTAAA TAAACAAGCT CATAGCAATA CTGTATCAGG 6660 

AGATTTTGGA TGCTTATCTA ACACACGAAA GGCTCAGGGA AGACATTTGG AACAAGAAGG 6720 

TAATATTAAA CATGCTTTAA TATATTACAT AGAATCTTTG ATAATTACTA TTTCAGGATT 6780 

AGAAAACAAT TTTTCAGCCA CTGATTATCC AGTATATTAT CCCGATTCGA TACCTGACTA 6840

CTCACTAAAA CATATTCAAA CATTAATGGA ATCATTATCT GATGACGATT ATGATTTTGC 6900 

TTTTGATGAA GCATTATTTC GCTTCTCAAT TTTGAATGCA AATCATTTTT TATCTAAGGA 6960 

AGATATTGAC TATTTAAGAG TTAATTTACC TCGTTCCACT GCTGAAGAAA TAAACAATTA 7020 

CTTAAAGAAA TATGAATGTT ATAGTCCTTT AAATAATTTA GAACTTGACG ATTTTGAATA 7080 

AATTGACTAT ACAAACATTT ATATACTCGA TATAGTCTCA ATTTTATCTG ATGATTGCCC 7140 

AAATTTTTCA ATAATAAAAC GCATAATATT ATGGAGACAA TCCCCTATAT TATGCGTTCT 7200 

TTTAATATCA AAGACTTTTT CACAAACTTC TTTGATATCT AATTACATGC CCCCTGCAGG 7260 

AATCGAACCT GCAACTACTC CTTAGGAGGG AGTTGTTATA TCCATTGAAC TAAGGGAGCT 7320 

AGATAAAAAC TCTCCTAAAT GAGCAGAGTT TTTTAGTCGA ATTAACGACC GATTTCTTTG 7380 

ATACGAGCTG CTTTACCTTG AAGAGCACGC AAGTAGTACA ATTTCGCACG ACGTACTTTA 7440 

CCGTAACGAA CAACTTCGAT TTTTTCAACA CGTGGAGTGT GGATTGGGAA GATACGCTCA 7500 

ACACCTACAC CGTTAGAGAT TTTACGAACT GTGTAGTTTT CTGAGATTCC AGCACCTTTA 7560 

CGTGCGATAA CAACACG 7577 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4945 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double




(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CCTCGCTGAT GATTGCTGCT GTTTTATTTG CTGGTCCACC CTTGGCTGAA GAAACTGCAG 60 

TTCCTGAAAA TAGCGGAnCT AATACAGAGC TTGTTTCAGC ACAGAGTGAG CATTCGACCA 120 

ATGAAGCTGA TAAGCAGAAT GAAGGGGAAC ATGCTAGAGA AAACAAGCTA GAAAAGGCAG 180 

AAGGAGTAGC GATAGCATCT GAAACTGCTT CGCCAGCAAG CAATGAAGCT GCAACTACTG 240

AAACTGCAGA AGCAGCTAGC GCAGCTAAAC CAGAGGAAAA AGCAAGTGAG GTGGTTGCAG 300 

AAACACCATC TGCAGAAGCA AAACCTAAGT CTGACAAGGA AACAGAAGCA AAGCCCGAAG 360 

CAACTAACCA AGGGGATGAG TCTAAACCAG CAGCAGAAGC TAATAAGACT GAAAAAGAAG 420 

TCCAGCCAGA TGTCCCTAAA AATACAGAAA AAACATTAAA ACCAAAGGAA ATCAAATTTA 480 

ATTCTTGGGA AGAATTGTTA AAATGGGAAC CAGGTGCTCG TGAAGATGAT GCTATTAACC 540 

GCGGATCTGT TGTCCTCGCT TCACGTCGGA CAGGTCATTT AGTCAATGAA AAAGCTAGCA 600 

AGGAAGCAAA AGTTCAAGCC TTATCAAACA CCAATTCTAA AGCAAAAGAC CATCCTTCTG 660

TTGGTGGAGA AGAGTTCAAG GCCTATGCTT TTGACTATTG GCAATATCTA GATTCAATGG 720 

TCTTCTGGGA AGGTCTCGTA CCAACTCCTG ACGTTATTGA TGCAGGTCAC CGTAACGGGG 780 

TTCCTGTATA CGGTACACTC TTCTTCAACT GGTCTAATAG TATTGCAGAT CAAGAAAGAT 840 

TTGCTGAAGC TTTGAAGCAA GACGCAGATG GTAGCTTCCC AATTGCCCGT AAATTGGTAG 900 

ACATGGCCAA GTATTATGGC TATGATGGCT ATTTCATCAA CCAAGAAACA ACTCGAGATT 960 

TGGTTAAACC TCTTGCAGAA AAGATGCGCC AGTTTATGCT CTATAGCAAG GAATATGCTG 1020 

CTAAGGTAAA CCATCCAATC AAGTATTCTT GGTACGATGC CATGACCTAT AACTATGGAC 1080 

GTTATCATCA AGATGGTTTG GGAGAATACA ACTACCAATT CATGCAACCA GAAGGAGATA 1140 

AGGTTCCGGC AGATAACTTC TTTGCTAACT TTAACTGGGA TAAGGCTAAA AATGATTACA 1200 

CTATTGCAAC TGCCAACTGG ATTGGTCGTA ATCCTTATGA TGTATTTGCA GGTTTGGAAT 1260 

TGCAACAGGG TGGTTCCTAC AAGACAAAGG TTAAGTGGAA TGACATTTTA GACGAAAATG 1320 

GGAAATTGCG CCTTTCTCTT GGTTTATTTG CCCCAGATAC CATTACAAGT TTAGGAAAAA 1380

CTGGTGAACA TTATCATAAA AATGAAGATA TCTTCTTTAC AGGTTATCAA GGAGACCCTA 1440 

CTGGCCAAAA ACCAGGTGAC AAAGATTGGT ATGGTATTGC TAACCTAGTT GCGGACCGTA 1500 

CGCCAGCGGT AGGTAATACT TTTACTACTT CTTTTAATAC AGGTCATGGT AAAAAATGGT 1560 

TCGTAGATGG TAAGGTTTCT AAGGATTCTG AGTGGAATTA TCGTTCAGTA TCAGGTGTTC 1620

TTCCAACATG GCGCTGGTGG CAGACTTCAA CAGGGGAAAA ACTTCGTGCA GAATATGATT 1680

TTACAGATGC CTATAATGGC GGAAATTCCC TTAAATTCTC TGGTGATGTA GCCGGTAAGA 1740

CAGATCAGGA TGTGAGACTT TATTCTACTA AGTTAGAAGT AACTGAGAAG ACCAAACTTC 1800

GTGTTGCCCA CAAGGGAGGA AAAGGTTCTA AAGTTTATAT GGCATTCTCT ACAACTCCAG 1860

ACTACAAATT CGATGATGCA GATGCATGGA AAGAGCTAAC CCTTTCTGAC AACTGGACAA 1920

ATGAAGAATT TGATCTTAGC TCACTAGCGG GTAAAACCAT CTATGCAGTC AAACTATTTT 1980

TCGAGCATGA AGGTGCTGTA AAAGATTATC AGTTTAACCT AGGACAATTA ACTATCTCGG 2040

ACAATCACCA AGAGCCACAA TCGCCGACAA GCTTTTCTGT AGTGAAACAA TCTCTTAAAA 2100

ATGCCCAAGA AGCGCAAGCA GTTGTGCAAT TTAAAGGCAA CAAGGATGCA GATTTCTATG 2160

AACTTTATGA AAAAGATGGA GACAGCTGGA AATTACTAAC TGGCTCATCT TCTACAACTA 2220

TTTATCTACC AAAAGTTAGC CGCTCAGCAA GTGCTCAGGG TACAACTCAA GAACTGAAGG 2280

TTGTAGCAGT CGGTAAAAAT GGAGTTCGTT CAGAAGCTGC AACCACAACC TTTGATTGGG 2340

GTATGACTGT AAAAGATACC AGCCTACCAA AACCACTAGC TGAAAATATC GTTCCAGGTG 2400

CAACAGTTAT TGATAGTACT TTCCCTAAGA CTGAAGGTGG AGAAGGTATT GAAGGTATGT 2460

TGAACGGTAC CATTACTAGC TTGTCAGATA AATGGTCTTC AGCTCAGTTG ACTGGTAGTG 2520

TGGATATTCG TTTGACCAAG CCACGTACCG TTCTTAGATG GGTCATCGAT CATGCAGGAG 2580

CTGGTCGTGA GTCTGTTAAC GATGGCTTGA TGAACACTAA AGACTTTGAC CTTTATTATA 2640

AAGATGCAGA TGGTGAGTGC AAGCTAGCTA AGGAAGTCCG TGGTAACAAA GCACACGTGA 2700

CAGATATCAC TCTTGATAAA CCAATCACTG CTCAAGACTG GCGCTTGAAT GTTGTCACTT 2760

CTGACAATGG AACTCCATGG AAGGCTATTC GTATCTATAA CTGGAAAATG TATGAAAAGC 2820

TTGATACTGA GAGTGTCAAT ATTCCGATGG CCAAGGCTGC AGCCCGTTCT CTAGGCAATA 2880

ACAAGGTACA AGTTGGCTTT GCAGATGTAC CGCCTGGAGC AACTATTACC GTTTATGATA 2940

ATCCAAATTC TCAAACTCCG CTCGCAACCT TGAAGAGCGA AGTTGGAGGA GACCTAGCAA 3000

GTGCACCATT GGATTTGACA AATCAATCTG GTCTTCTTTA TTATCGTACC CAGTTGCCAG 3060

GCAAGGAAAT TAGTAATGTC CTAGCAGTTT CCGTTCCAAA AGATGACAGA AGAATCAAGT 3120

CAGTCAGCCT AGAAACAGGA CCTAAGAAAA CAAGCTACGC CGAAGGGGAG GATTTGGACC 3180

TTAGAGGTGG TGTTCTTCGA GTTCAGTATG AAGGAGGAAC TGAGGACGAA CTCATTCGCC 3240

TAACTCACGC AGGTGTATCA GTATCAGGTT TTGATACGCA TCATAAGGGA GAACAGAATC 3300

TTACTCTCCA ATATTTGGGA CAACCGGTAA ATGCTAATTT GTCAGTGACT GTCACTGGCC 3360

AAGACGAAGC AAGTCCGAAA ACTATTTTGG GAATTGAAGT AAG*P€-1GAA CCGAAAAAAG 3420

ATTACCTAGT TGGTGATAGC TTAGACTTGT CTGAAGGACG CTTTGCAGTG GCTTATACCA 3480

ATGACACCAT GGAAGAACAT TCCTTTACTG ATGAGGGAGT TGAAATTTCT GGTTACGATG 3540 

CTCAAAAGAC TGGTCGTCAA ACCTTGACGC TTCATTACCA AGGCCATGAA GTTAGCTTTG 3600 

ATGTTTTGGT ATCTCCAAAA GCAGCATTGA ACGATGAGTA CCTCAAACAA AAATTAGCAG 3660 

AAGTTGAAGC TGCTAAGAAC AAGGTGGTCT ATAACTTTGC TTCATCAGAA GTAAAAGAAG 3720 

CCTTCTTGAA AGCAATTGAA GCGGCCGAAC AAGTGTTGAA AGACCATGAA ACTAGCACCC 3780 

AAGATCAAGT CAATGACCGA CTTAATAAAT TGACAGAAGC TCATAAAGCT CTGAATGGTC 3840 

AAGAGAAATT TACGGAAGAA AAGACAGAGC TTGATCGCTT AACAGGTGAG GTTCAAGAAC 3900 

TCTTGGCTGC CAAACCAAAC CATCCTTCAG GTTCTGCCCT AGCTCCGCTT CTTGAGAAAA 3960 

ACAAGGCCTT GGTTGAAAAA GTAGATTTGA GTCCAGAAGA GCTTACAACA GCGAAACAGA 4020 

GTCTAAAAGA TCTGGTTGCT TTATTGAAAG AAGACAAGCC ACCAGTCTTT TCTGATAGTA 4080 

AAACAGGTGT TGAAGTACAC TTCTCAAATA AAGAGAAGAC TGTCATCAAG GGTTTGAAAG 4140 

TAGAGCGTGT TCAAGCAAGT GCTGAAGAGA AGAAATACTT TGCTGGAGAA GATGCTCATG 4200 

TCTTTCAAAT AGAAGGTTTG GATGAAAAAG GTCAAGATGT TGATCTCTCT TATGCTTCTA 4260 

TTGTGAAAAT CCCAATTGAA AAAGATAAGA AAGTTAAGAA AGTATTTTTC TTACCTGAAG 4320

GCAAAGAGGC AGTAGAATTG GCTTTTGAAC AAACGGATAG TCATGTTATC TTTACAGCAC 4380 

CTCACTTTAC TCATTATGCC TTTGTTTATG AATCTGCTGA AAAACCACAA CCTGCTAAAC 4440 

CAGCACCACA AAACACAGTC CTTCCAAAAC CTACTTATCA ACCGACTTCT GATCAACAAA 4500 

AGGCTCCTAA ATTGGAAGTT CAAGAGGAAA AGCTTGCCTT TCATCGTCAA GAGCATGAAA 4560 

ATACTGAGAT GCTAGTTGGG GAACAACGAG TCATCATACA GGGACGAGAT GGACTGTTAA 4620 

GACATGTCTT TGAAGTTGAT GAAAACGGTC AGCGTCGTCT TCGTTCAACA GAAGTCATCC 4680

AAGAAGCGAT TCCAGAAATT GTTGAAATTG GAACAAAAGT AAAAACAGTA CCAGCAGTAG 4740 

TAGCTACACA GGAAAAACCA GCTCAAAATA CAGCAGTTAA ATCAGAAGAA GCAAGCAAAC 4800 

AATTGCCAAA TACAGGAACA GCTGATGCTA ATGAAGCCCT AATAGCAGGC TTAGCCAGCC 4860 

TTGGTCTTGC TAGTTTAGCC TTGACCTTGA GACGGAAAAG AGAAGATAAA GATTAAATAT 4920 

CGAAAAATCT TGTGAAATCT TTCCG 4945 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25002 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

GACAACTCAA GTAGCTTTTT CTTATTTTGA AAAAGGAGAT CAGAGTTTAA CTATGTCAGA 60 

AAAATCACAA TGGGGGTCGA AACTTGGTTT TATTCTAGCA TCTGCTGGCT GGCCATCGGG 120 

CTTGGTTCCG TTTGGAAGTT TCCCTACATG ACTGCTGCTA ATGGCGGTGG AGGCTTTTTA 180 

CTAATCTTTC TCATTTCCAC TATTTTAATC GGTTTCCCTC TCCTGCTGGC TGAGTTTGCC 240 

CTTGGCCGTA GTGCTGGCGT TTCCGCTATC AAAACCTTTG GAAAACTGGG CAAGAATAAC 300 

AAGTACAACT TTATCGGTTG GATTGGCGCC TTTGCCCTCT TTATCCTCTT ATCTTTTTAC 360 

AGTGTTATCG GAGGATGGAT TCTAGTCTAT CTAGGTATTG AGTTTGGGAA ATTGTTCCAA 420 

CTTGGTGGAA CGGGTGATTA TGCTCAGTTA TTTACTTCAA TCATTTCAAA TCCAGCCATT 480 

GCCCTAGGAG CTCAAGCGGC CTTTATCCTA TTGAATATCT TCATTGTATC ACGTGGGGTT 540 

CAAAAAGGGA TTGAAAGAGC TTCGAAAGTC ATGATGCCCC TGCTCTTTAT CGTCTTTGTT 600 

TTTATCATCG GTCGCTCTCT CACTTTGCCA AATGCCATGG AAGGGGTTCT TTACTTCCTC 660 

AAACCAGACT TTTCAAAACT GACTAGCACT GGTCTCCTCT ATGCTCTGGG ACAATCTTTC 720 

TTTGCCCTCT CACTAGGGGT TACAGTCATG TTGACCTATG CTTCTTACTT AGACAAGAAA 780 

ACCAATCTAG TCCAGTCAGG AATCTCCATC GTAGCCATGA ATATCTCGAT ATCCATCATG 840 

GCAGGTCTAG CCATTTTCCA AGCTCGATCC CCCTTCAATA TCCAGTCTGA AGGGGGACCC 900 

AGCCTGCTCT TTATCGTCTT GCCTCAACTC TTTGACAAGA TGCCTTTTGG AACCATTTTC 960 

TACCTCCTCT TCCTCTTGCT CTTCCTTTTT GCGACAGTCA CTTTTTCTGT CGTGATGCTC 1020 

GAAATCAATG TAGACAATAT CACCAACCAG GATAACAGCA AACGTGCCAA ATGGAGTGTT 1080 

ATTTTAGGAA TTTTGACCTT TGTCTTTGGC ATTCCTTCAG CCCTATCTTA CGGTGTCATG 1140 

GCGGATGTTC ACATTTTTGG TAAGACCTTC TTTGACGCTA TGGACTTCTT GGTTTCCAAT 1200

CTCCTCATGC CATTTGGAGC TCTCTACCTT TCACTTTTTA CAGGCTATAT CTTTAAAAAG 1260 

GCTCTTGCAA TGGAGGAACT CCATCTCGAT GAAAGAGCAT GGAAACAAGG ACTGTTCCAA 1320 

GTCTGGCTCT TCCTTCTTCG TTTCTTCGTT TCGTCATTCC AATCATCATC ATTGTGGTCT 1380 

TCATTGCCCA ATTTATGTAA TCAAAAAGGA CTTGAGTAGT GAACTCAGGC CCTTTCTTTT 1440 

TATGGATGGC TAACAATCAA TTCCAAACCT TGCCCTTCCA GAGTCCAAGC TTCAACATCA 1500 

CTTGGTAGGA TAAAGTGGCT GCCTTTTTGA ATTGGATAAT TTTTCCCCTC AACAGTTAGC 1560

TGACCTTGAC CAGCCAAGAC ACTCAATAAG CTGTAGTCAC CTGTCTTTTC AAAGTCAACT 1620

TTTCCAGTAA TTTCCCACTT GTAAACTGCG AAGAAATCAT TAGATACAAG GAGAGTGGAA 1680 

CGCAAATCAT CTGCTTTAAC AGTTACAGGA CGGCTATTTG CTGGCTCACC AATGTTCAAG 1740 

ACATCGATGG ATTTTTCAAG ATGAAGTTCA CGCAAGTTGC CTTTGTCATC CTTGCGGTCA 1800 

AAGTCATAGA CGCGATAGGT GGTATCGCTA GACTGCTGGG TTTCAAGGAT TAAGATACCC 1860 

GCCCCGATAG CGTGCATAGT CCCGCTTGGT ACATAGAAGA AATCTCCAGC CTTAACAGGG 1920

ACTTTGGTCA ACAAGTCATC CCAGTTCTTG TCCTCGATTT GCTGGCGGAG TTCTTCTTTT 1980 

GACTTGGCAT TGTGACCGTA GATAATCTCT GAACCTTCAT CCGCTGCGAT AATGTACCAG 2040 

CATTCTGTTT TTCCGAGTTC GCCTTCATGC TCGAGTCCAT AAGCATCGTC TGGGTGAACT 2100 

TGGACACTGA GCCAGTCGTT GGCATCGAGG ATCTTGGTCA AAAGTGGAAA TACAGGTTCT 2160

GGACGATTGC CAAATAATTC ACGGTGTTCC GCATACAAAG TAGCAAGATC TGTTCCCTCG 2220 

TAACGACCAT TGGCAACTTT AGAGACTCCA TTTGGATCGG CTGAGATGGC CCAATATTCT 2280 

CCGATTTTTT CACTTGGGAT GTCGTAGCCA AACTCATCAC GTAGCTTGGC TCCACCCCAG 2340 

ATTTTTTCTT GCATAACTGA TTGTAAAAAT AATGGTTCTG ACATGTCGAT CTCCTGTCTG 2400

ATTTTTCTCC CCTCATTATA GCAAAAAAAG AGTTCGAATT GAACTCTTTT TTACATCTTA 2460

TAAAGCAGGG AGAAGATTTT ATAAAAATAG TAAACAAATG TGCTCTACCC GATGCTTGCA 2520 

CCATTGCTAT AAATGACATC CTTGTACCAA TAGAAGGACT TCTTCTTCCT ACGTTTGAGA 2580 

GCTCCGTTTC CTACATTATC TCGATCTACA TAGATAAAGC CATAGCGCTT ATTCATTTCC 2640 

CCTGTGCCAG CTGAAACCGG ATCGATACAG CCCCAAGTCG TATAACCAAG CAAGTCAACC 2700 

CCGTCTTGGT AAATGGCATC TCGCATGGCC TTGATGTGGG CCTCTAAGTA AGTAATCCGA 2760

TAGTCATCTG CTACATAACC ATTCTCATCC GGTGTATCCA TAGCACCGAG TCCATTTTCT 2820 

ACGATAATAC TAAACTAAAA TCAAAAAGCA TTATATAATA GTGATATGAA ATCAACTAAA 2880 

GAAGAAATCC AAACCATCAA AACACTTTTA AAAGACTCTC GTACAGCTAA ATATCATAAA 2940 

CGCCTTCAAA TCGTTCTATA GTAAAATGAA ATAAGAACAG TACAAATCGA TCAGGACAGT 3000 

CAAATCGATT TCTAACAATG TTTTAGAAGT AGGGGTGTAC TATTCTAGTT TCAATCTACT 3060 

ATATTTCGTC TGATGGGCAA ATCTTATAAA GAGATTATAG AACTTTTATA GTAGTTTGAA 3120

ATAAGATGTG AACAACTCTA TCAGGAAAGT CAAATTAATT TATAGAAATA TTTTAGCAGC 3180 

CAAGGTGTAC TGTTATAGAT TCAATACACT ATAGACTGTA ATCAAACAAC GATTTGGCGA 3240 

AATGTAAAAA AATATGAGGA GTTCGGACTC GACTCTCTCC TTCAAGAAAC ACGTGGTGGT 3300 

CGTAACCATG CATATATGAC AGTTGAGGAA GAGAAAGCCT TTCTTGCCCG CCATTTGAAG 3360

GCTACAGAGG CAGGAGAATT TGTTACAATT GATGCCTTAT TTCAGGCTTA TAAAAAGGAG 3420

TTAGGTCGTT CCTACACACG TGATGCCTTC TATCAACTGT TGAAGCGCCA TGGTTGGCGA 3480

AATATTACGC CACGTCCAGA ACATCCTAAG AAAGCAGACG CTCAAACCAT TGTTGCGTCT 3540

AAAAATAAAA TCTCAATCCA AGAAGGCAAG AAAGCGTTTT AAATATAGTA GACGTTTTCG 3600

TAAGGTTTGC TTGATGTACC AAGCTGAAGC TGGTTTCGGT AGAATCAGTA AACTGGGATC 3660

TTGTTGGGCT CCAATAGGAG TAGGTCCACA TATCCATAGT CACTATATAC GAGAATTTCG 3720

CTATTGTTAT GGAGCTGTTG ATGCCTATAC ACGCGAATCA TTTTTCTTAA TACCTGGTAG 3780

ATGTAATACT GAGTGGATGA ACGCCTTTTT AGAAGAGCTT TCACAACCTT ATCCTTTTAC 3840

TCGTTATGGA CAATGCTATA TGGCATAAAT CAAGTACCTT AAAGATTCCG ACTAATATTG 3900

GTTTTGCATT TATTCCTCCA TACACACCAG AGATGAACCC CATTGAACAA GTGTGGAAAG 3960

AGATTCGTAA ACGTGGATTT AAGAATAAAG CCTTTCGAAT TTTGGAAGAT GTCATGAATC 4020

AACTCCAAGA TGTCATACAA GGATTGGAGA AGGAGGTGAT AAAGTCCATC GTTAATCGGA 4080

GATGGACTAG AATGCTTTTT GAAAGCAGAT GAGTATTATA TGCAATTTCT TTATATAAAA 4140

AGACCGGATT GCTCCGATCT TTCAATAGTT CATATTCTCA ATTTCTATTT TAAAAATAGC 4200

TAAGGTTAAC GTCAAATGAC TACGCGACCT ATTTCATACG ATAAAAATCA AGCACTAGAC 4260

CAGCAGGTCC TTGAACTAAT AAGGACTCTG TTCCCCAATC GGTTACAGTT GGTCCGTGTA 4320

AAACCTTTAT ACCAAGCTCG TTCAACCGTT TGTAGTTCTG GTCTACATCC TCAACCTCGA 4380

TATGAATAAT GATTCCTGAC TGAAAGTTTT CCAAAGGAAC CAAATGATTT TGTGACAACA 4440

TAAGGCAGTG ACTACCAATC GTAAACTGAG CAAAACCATC ATTAGCATAA TCTGCCTTTT 4500

TATCCAAGAT ATGCTCCAAG TCAGCACAGA CTTGGGGAAC ATTTCAAACG ATAATATCTA 4560

ATTGATTTAA ATTCATTTAC TCTCCTCCAT AAAAAGACCG GATTGCTCCG ATCTTTTAAA 4620

GTTCTGCTCT ATCAAAATCA AAGAATAAAG TCTACAAGTT TCATATTTGA TTTTCGGCGA 4680

GAGGAATTAT TTAATTGCGC GTGATTGCAA TCCTTCTTCT TCCAAGAAGA GACGGAATGG 4740

TACGAGTTCT TCTGCTTCGT ATTTTTCCTT GAAGGCTTTC ATAGCTTCTT CTGAGTGAAG 4800

TTTTGGATCC AATTCAAGTA CTTCTACTGG AAGTGGACGG TGTTGAGTGA TGCGAGCATC 4860

GATGACAACA GTTTTACCTT crrrGTTCAA TTTAACAGCT TCTGCAACAA CTGCATCGAT 4920

GTCTTCGATA CGGTCAACTG TGAATCCAAC AGCTCCTTGA GCTTCCGCAA TTTTAGCGTA 4980

GTCAGCGTTT GTGAAGTCTA CACCAAACAA GTGTTTGTTT GTATCTTCGT ATTTGTTCTT 5040

GATGAAGCCG TACTCAGCAT TTGAGAAGAC AAGGTTGATA ACTGGAAGGT CGTATTGAAC 5100

GTTTGTGATA ACGTCTGGGT AGCACATGTT GAATGCTCCG TCACCCATGA TGTTCCATAC 5160

TTGGCGATCT GGATTGTCTT TCTTAGCAGC GATACCACCA GGAAGGGCAA TACCCATTGT 5220 

CGCAAAGAGT GGAGATCTAC GCCACATGTT CTTAGGTGTC ATCTGAAGGT GACGAGTAGA 5280 

TGTTTGAGTA GTGTTACCTA CGTCGATTGA GTAGATAGCG TCTTGATCAG CATGTTTGTT 5340 

GATTGCATTG TAAACTTGAT ACAATTGCAA TTCACCCTCA GTTTTACCTT CGAGTTTGTT 5400 

CATGTAATCA CGCCAGTTTT GGTTGTTCTT AACGTTTGCA CGCCACCATG GAGTTGATTC 5460 

AACTGGGTTT ACTTTGTCAA GGATAGCTTT AGCTGCTTGA CCAGCATCAC CAAGGATTGA 5520 

AGCGTCAAGG GCATGACGTT TACCAAGTTT GTAAGGGTCG ATATCGACTT GGATGAATTT 5580 

TTCAGTGTTC TTGAATGCTT CGTAAACTTC AGCAAATGGG AAGTTTGAAC CAAGGAAAAG 5640

AACTGTGTCT GCTTCAAAGA CCACTTCGTT GGCTGGTTTC CAACCAACAC GGTAAGCAGA 5700 

ACCTGTCAAA CCTTCATAGT TCCATTCGAA AGCTTCAAAG TTTTTACCAG TTGTGATGAT 5760 

TGGTGCTTTG ATTTTACGTG ACAATTCAGT AATCACTTCA CCAGCTTTAA CACCACCAAA 5820 

TCCAGCATAG ATAACTGGGC GTTCAGCATT GTTCAAGATT TCAACAGCTT TGTCGATTTC 5880

AACTTCGTTC AAAGCACCAG CGATGAATGA GCGTTCGTAT GAACCTGAAC CGTAGTATGA 5940 

GTTTTCATCG ATTTCTTGGA AACCGAAGTT TACTGGAATT TCAACAACAG CTGGACCTTT 6000 

TTTAGAAACT GCAGCACGGC AGGCTTCGTC AATTACTTTT GGCAATTGCT CAGCGTAAGC 6060 

TACACGTTTG TTGTAAACAG CGATACCGTT GTACATTGGG TTTTGGTTAA GCTCTTGGAA 6120 

AGCATCCATG TTCAATTCGT TAACTGGACG TCATCCAAGG ATCGCTAGGA ATGGAGTGTT 6180 

ATCCATAGCT GCATCGTAAA CACCGTTAAT CAAGTCAGTC GCACCTGGAC CACCTGAACC 6240 

AACTGCAACC CCGATTGAGC CGCCGAATTT AGCTTGCATA ACCCCTGCAA GAGCACCTGT 6300 

CTCTTCGTGG CGAACTTGTA AGAAACGGAT ATCTTTGTCT TCAGCCAAAG CGTCCATCAA 6360 

TGAGCTGAGT GTTCCTGATG GGATACCGTA GATTGTATCT ACGCCCCATG TTTTCAATAC 6420

GTTAAGCATT GCTGCAGATG CAGTAATTTT CCCTTGAGTC ATAATGATAA CTCTCCTTCA 6480

ATTTTTTTAA ACTTGGAGAA TACGATTACA TAGAATTGGA AACGTTCTCC AAATTTTTAC 6540 

TATTCCACTG TATCATATTT ATGCTGACTT TTCTAAAAAT CTGCTCAAAA CTCTCTATTC 6600 

TCTATTCTAA TACAGTTTTG AAAGTTCTGT CATTTCTGTT TTATAACAAA GAAATCTAGT 6660 

CATTACTTTT AGTCTATTTT ACTAAAATTT AACAGAAGGG AACTGGTCAG AACAGATACA 6720 

GAACTAAAGG CCATGGCTAG ACCTGCCAAT TCTGGGTTGA GAGCCAGTCC AACACCTGAA 6780 

AAGACTCCTG CTGCAATCGG AATTCCGACA ACATTGTAGA TAAAAGCCCA GAAAAGATTG 6840 

AGTAGAATTC GATGAAAGGT TTTCTTACTC ATATCAAAGG CACGAACCAC TCCTAAAAGA 6900

TTATTGGTTG TCAACACCAA ATCTGCTGAC TCGATGGCGA TATCTGTTCC AGCTCCCATA 6960

GCAATCCCCA CATCTGCTAC ACTAAGGGCA GGAGCGTCAT TGATACCGTC CCCAACAAAG 7020

GCTACTTTCC CTCACTGTTG CAGTTTATGG ATTTCATGGG CTTTTTCTTC TGGCAAGACG 7080

CCTGCAATGA CCTCTTCAAT TCCCATTTGA TCTGCAATAG CACGCGCCAC ACCAGCATTG 7140

TCTCCTGTCA GCATGACTGT TCGGAGACCA CCTTTTTTTA GCTGACTGAT GGCTAGCTTA 7200

GCATTTTCCT TAGGAATATC TTGCAAAGCA AGCAAGCCTT TGATTTCATT GTCAACAGCT 7260

AAGAACACAA CTGTCTTAGC TTCTPTTTCT ACTTCTTCTA GTTTATCTTG ATAAGTATTA 7320

GAAATATCCA TGCCATCCAG CATTTTAGCA TTTCCAAGTA AAACTTGTTT TCCATTGATT 7380

CGCCCTGAAA CACCTTTCCC GTGCAAGGAC TGAAAATTTT CAACAGTTTG AAACTCAAGT 7440

CCAGCTTCAC TCGCTCGCTT AACGATAGCC TCAGCCAGTG GGTGTTGAGA AGCATCTTCC 7500

AAGGAGGCTG CCAACCCAAA CACTTCTACT TCGTCGCCGA TGACATCTGT TACCACAGGT 7560

TTCCCTTCCG TCAAAGTCCC GGTCTTATCA AAGACAAGGG TTTGAACTTT CTGGATTTCC 7620

TGTAAGACAG TTCCATTTTT GAGGAGAACC CCCATCTTCG CACTACGTCC TGTCCCCACC 7680

ATAAGGGCTG TCGGTGTTGC AAGTCCCAAG GCACAAGGAC ACGCGATAAT CAAAACCGCC 7740

ACTCCGTAGA GAAGAGAGGA CACAAAGCTA GCTCCAAGCA CAACCACACT ATCCCTGAGC 7800

AAGACGAACC AAACCCAAAA GGTCATGATT CCTAAAATGA CAACTACTGG GACAAAAATC 7860

CCTGAAATCT TATCCGTCAA GTCCTGAATC GGCGCACCAC TTGTCTGAGC TTTCTTCACA 7920

AAATCCACAA TCTCAGCCAA AACAGTCTCT GAGCCAACTT TTTCTCCTCT AAAGACAAGC 7980

GTTCCACTAT CATTCATGCT TGAGCCAATG ACAGTATCTC CAACTGTCTT GTCCACAGGC 8040

AGACTCTCAC CTGTCACCAT GGATTCGTCA ATACTAGAGA CACCTTCTAC TACGACACCA 8100

TCAACAGCAA TCTTTTCACC GGGACGCACT CGAATCAGGT CGCCTACCTT GACTTGTTCC 8160

AAAGGAACTT GGACATAACT ATCATCACTC AAGACTTCTG CGCTTTTACC TTGCAAGTCC 8220

AGTAATTTCT CCACAGCTTG GGACGTATTT TTTCTCATTT TTTCCTCAAA AACTGCTCCC 8280

AAAAGAACGA AAAAGAGGAT AAATCCAGCA CTTTCGAAGT AAACAGGGAG ACCAGCAAAG 8340

AGAGCAACTA GCCTATAGAA ATAAGCCACT AGAGTTCCCA GCGCAACCAA GGTATCCATC 8400

TTGGCATTGT GCTTTTTAAA ACTGGCCCAA GCACTCTGGA TATATGGCTT ACCTGCAACT 8460

AACATAATAG GCGTTGTTGC TAGAAAGGTT CCCCAATGCA TGACTTGATG ACTAATGCTA 8520

CCTGTCAACA TCCCAATCAT GAGAATCACA AGAGGCACAG TAAAGATACT ACTAATCCAA 8580


AAACCTTGCA GGAGAGATAG AGATTTTCGA GTCTTCTCAA CGACTGTATA GCTTCCCTTT 8640

TGCATCTTCA TGCCACAAGA AAATTCATGT CGCCCTAATT CTTGAGCCGT AAAACGAATG 8700

ACTTTCTCCT CATCTACGCC GATTGGTTCC AAGATACCTT CTTCTTCAAA CAGAATTTCC 8760

TTATAACAGT TTGAAGGAGT AGCACGATGA AAGGTAATCT CAGCTGGAAT TCCCTTTTGA 8820

AGCTGGATAT GGGCTGGATG ATAGCCTTTT TCAGCTCGGA TACGGATTTT TTGAATGCCA 8880

TTTTCTAAGC TTGCTTTCAC AATTTCTGTC ATAGTCTCCA CCTACTCTAC AATCATCTTG 8940

CCGTGCATCA TGTTCATACC ACAAGCAAAG CCAAACTCTC CAGCCTGTTC AGGCGTGATT 9000

TCCACTACAT ACTCTTCCCC CATTGGCAGG TTCGCATGTA CACCAAAATC TGGAAAAACA 9060

ATTTGATCCA GACATGGTGA AGGATCCTTG CGGTCAAAGA CAATGCGTGC TGCCACTGAT 9120

TTCTTGAGGA CAATCAACTC AGGAGTATAG CCTCCCATGA CTTCCACTCG AATCTCTTGG 9180

TATCCGTTTT TTTGCTGGGC TTTTTGTCCA GATTTTTCAG GCTTTTTGAA AAACCAAAAC 9240

AAGATAAACG CGATAAGGGC AATACAAATA ATGGTTACAA TACTATTTAA CATGACGTCT 9300

CCTTTACATA CAATTACATC TTACTTCTGT TACAGCACTT GATTTCTTCT CTGAAATCAC 9360

AGCTTCCAAG TCTTCCAAGT CAGTCTGAGT AAATTCACAT TCTACAATCA AGTCAGCCAA 9420

CAAATTCCTA ATCCTACGGG AACAAACCTT CTCTTTGATA TCTTGGACAA GTAAATCCCG 9480

ACTTTGGTCT AGAGTTAAAA GGGCTGAATA AACAAAGGAC TTGCCTTCTT TTTTCCGAGT 9540

CAAACACTCT TTATCAACCA GACGAGCCAA AAGTGTCTCA ACCGTGGACT TGGACCAGTC 9600

AAACCGCTCT GCCAAAACCC TAATCAAATC TGTACTGGTC TGCTCCCCCT GCATCCAAAT 9660

AATCTTCATG ACCTGCCATT CTCCATCTGA AATCTGCATT ACCATACCTC CAAAATCTAC 9720

ATTTGTCAAT TACACTCATC AGTATACTCT TAAAATCTAC ATTTGTCAAT TATAGAAATA 9780

ATATTTTCTT CGAAAAATAG AATTTTAATC ATTTGAAAAA CGATTTGCAG TCAAATATTA 9840

CTATATAAAC AATAAAAATA TGCTATACTA AAGAAAAAAG AAAACAACCA CTAGGGGTGC 9900


GTAAAGCTGA GATTAACGAC TGTTAGATCC CTCTGACTCA ATCTAGGTAA TGCTAGCTGA 9960

AACAGACTAG 10020

CTTGTTCTTA AGAATACAAA CTTCAGTTGG TTGGGAGGTT TTAGATGACT TATTTACCCG 10080

TTGCTTTGAC CATTGCAGGG ACTGACCCTA GTGGTGGTGC TGGCATTATG GCAGATTTAA 10140

AGTCATTCCA AGCGAGAGAT GTCTATGGAA TGGCTGTTGT AACCAGTCTT GTCGCTCAAA 10200

ATACCAGAGG TGTTCAGCTA ATCGAGCACG TTTCTCCTCA AATGTTGAAA GCCCAATTGG 10260

AGAGTGTCTT TTCTGATATT CCACCTCAGG CTGTAAAAAC TGGAATGTTG GCTACTACTG 10320

AAATCATGGA AATCATCCAA CCCTATCTTA AAAAACTGGA TTGTCCCTAT GTCCTTGATC 10380

CTGTTATGGT TGCTACAAGT GGAGATGCCT TGATTGACTC AAATGCTAGA GACTATCTCA 10440

AAACAAACTT ACTACCTCTA GCAACTATTA TTACCCCAAA TCTTCCTGAA GCAGAAGACA 10500

TTGTTGGTTT TTCAATCCAT GACCCCGAAG ACATGCAGCG TGCTGGTCGC CTGATTTTAA 10560

AAGAATTTGG TCCTCAGTCT CTGGTTATCA AAGGCGGACA TCTCAAAGGT GGTGCTAAAG 10620

ATTTCCTCTT TACCAAGAAT GAACAATTTG TCTGGGAAAG CCCACGAATT CAAACCTGTC 10680

ACACCCATGG TACTGGATGT ACCTTTGCTG CAGTCATTAC TGCTGAACTA GCCAAGGGCA 10740

AGAGTCTTTA CCAGGCAGTT GATAAGGCCA AGGCCTTTAT CACAAAAGCT ATTCAAGATG 10800

CCCCTCAACT CGGTCATGGT TCTGGTCCAG TCAACCATAC AACTTTTAAA GATTAAGAAA 10860

AAAAACTCTC TAGTTCCCAC TTTAAGCGAA TTAGAGAGTT TTTATACTCT TCGAAAATCT 10920

CTTCAAACTA CGTCAGCTTC CATCTGCAGC CTCAAAACAC TGTTTTGAGC TGACTTCGTC 10980

AGTCTTATCT AAAACCTCAA GGCAGTACTT TGAGCAACCT GCGACTAGCT TTCTAGTTTA 11040

CTCTTTGATT TTCATTGAGT ATTAATTAGG AAAGAATGTT ATGCAACTTT TTTAAAAAGG 11100

CTTGCGTTTT TGCCTCAATA TCTTCTGCTT GCATCAAATC ACGTACAACA GCTACACCAG 11160

CTATGCCAGT GCCCATAAGC TGATCAATAT TCTCCGAAGT CAAGCCTCCA ATAGCAACTA 11220

CTGGAATGGC AACCGTTTGG CAAATTGTTT TCAAGGTCGA TATCAGAGTA ATGGGCGCAT 11280

TTTCCTTGGT GGTGGTTGGG AAAATGGCTC CTGTACCCAA GTAATCTGCA CCTGATTTCT 11340 

CCGCTTCCAG AGCTCTTTTA ACCGTTTTAG CGGTGACACC GAGGATTTTT TCAGGACCCA 11400 

AGACTTTGCG AGCTACCGAA ACTGGTAATT CATCATCTCC GATATGCAGA CCTCCTGCAT 11460 

CAACCGCAAG ACAAACATCC AACCGATCAT CGATTATCAA GGGTACCTGA TAAGCATCTG 11520 

TTATTTCCTT GACTTGTTTT GCCAGTTCAT AATATTGATT GGTTGTGAGA TTTTTTTCTC 11580 

CCAATTGGAC TATGGTAACC CCTGAACCGC AGGCCGTCTC AACTTTTGCA AGAAAGCTTT 11640 

CCACGGAATC TTGATAGCGA TTGGTTACCA GATATACTCT AAGTGCTTCT CTATTCATAA 11700 

ACCTCTCCTT TGATGGTATC TAGCCAATTT TCATCTCTTC TTACCACCGA AAGCTGATTG 11760 

AGTACTTGGT AACGAAATTC TTCCAATCCC ATTCCTTGAA CAACTATTTT CTCAGCAGCG 11820 

ATATTGAGAT AAGAGACTGC TAAGCAAGAA GCTTCAAAAC CAGTCTTTCC TTGGCTGAGA 11880 

AAAACAGCTG TTAAGGCTCC AACCAAGTCT CCTGTCCCTG TTATCCAGTC TAATTCAGTA 11940 

CAGCCATTTC CCAGTACAGC GACCTGATTT TTCGAAACGA CGAGGTCCTT GGGACCTGTG 12000 

ACTAAGAAAG ACATACCAGG ATAGGTCTGA CACCAGTCTT TCAAGACTTG AAGCAAATCC 12060 

TCCGTTTCTT GATCTTTAGC ACTCGCATCG ACCCCAACGC CGTGGTGCTT TAATCCAACA 12120 

AGACTTCGAA TTTCTGACAT GTTTCCTTTA AGGACCGTAG GTCTATAGTC TAAAAGGTCT 12180

TTAACTAAGC TCTTACGAAT GGATGAAGTC GTTACGCCAA CCGCATCTAC TACCATCGGG 12240

AGAGAAGATT GGTTTGCATA CGAAGCTGCC ATGCGGATTG CTTTTTCCTT CTCAGCTGAC 12300

AAATGCCCCA AATTGATGAA GAGAGCCTGA CTTTGCTTAG TAAAATCAAG AACTTCACGG 12360

GAATCATCTG CCATGACAGG TTTGCATCCC AGAGCCAAAA TCCCATTTGC CAGCATCTCA 12420 

CAAGAAATCT CATTGGTAAT GCAGTGAATG AGGGAACTAG AGCCTATAGG AAAGGGATTT 12480 

GTAAATTCCT GCATCAGTCT ATCCTTTCAC TAAAGAAATA TCCCTGCACT TTTTTAAAGA 12540 

ATTCCTGCTT GATTAAAAAT CGAAAGGCAA TAAAGGAAAT CGCTGTACCA ATCAAGGTTG 12600

CTCCGAAAAA TCGAGGCGTG TAGATAAACC AGCTAAGCTT AGCAGCTGAT CCTGTAAAGA 12660 

GTACCATAAC AGGATAGGAA ACAATGGAAC CAATAATACC TGTTCCCAAA ATCTCTCCTA 12720 

GAGCAGAATA GTGAAATTTT CGACCGTACT TATAAAAGAG ACCTGCTAGA AGGGCTCCAA 12780 

AAGTCGCTCC TGTGAGAGCT AAAGGCGGAA TCCCTTGAGT CGTCATACGG ATAAAGGCTG 12840 

TGACTGTAGC CATAGCCAAG GCATAAACAG GTCCCATCAT GATTCCTGCT AGAATATTGA 12900 

CTACACTGGA CATCGGTGCC ATTCCCTCAA TTCGAAAGAT AGGTGTAAGG ACTACATCAA 12960 

GGGCAATCAT CATAGATAAA ATGGTTAATT TGTGAACTTG TAATTCGTGC TTTCTCATGC 13020 

TTCTATTCTT CTCCTTTTTC TAAAGACTGT AAATCGCTCT TCCATGTCTG GTGTTGGTAG 13080 

GCCATTTCCC AAAACTTGGC TTCCATATGA ACACTGATGT GGAAGGCATC TAGCATTTTT 13140 

TGCTTGTCTG TCTCGTCACT TTCTCGATAG AGCTGATTGA CCAGTGCTCC CTCCTCTCTG 13200 

ATCTCTTGCT CTAACTCATC CGTAATATAA GTTTCAATCC ATTGTTGATA GAGAGGATTT 13260 

GGTGATGGTT TAAGATTAAG TGATTTCCCT A7ATCATGGT ATAACCAAGG ACAAGGAACC 13320

AAGCTTGCAA AAGCGATCGC TAAGTTCGGT TCTGCAAATT GCCTATAAAT ATGAGAAATG 13380 

TAATGATAAC AGGTTGGAGC GATTGGATGT TGCTCCATTT CCTGGTCGCT GATTTCCAAT 13440 

TCCTTGAAAA ATTGTTGGCG AATAAATAAC TCACCCTCCA CTAAACCCTG AGCATTTTGT 13500 

TTCAAGAGTC TTTTCATCTC TTGGTTTGAA GTCTTATCAG CCAAAAGATG ATAGATTTCT 13560 

GAGAAAGCCT TCAGATAGTA GGCATCCTGA ATCAGGTAAT AGCGGAAAAT GCCAGGTTCT 13620 

AAATTCCCCT CTTGTAATTG TAAAATAAAG GGATGATGAA AGGAAGCCTG CCAAGCTTTC 13680

TTGGATAATT CCATCGCAAT ATCTGTAAAT TCCATAATAA CTCCTTTATA AAAATAGACT 13740 

GGTTTGAAGC AATAAAAAGA AAAGCAGGTA GATTAATTTT GTTTTTTTAG GAATATAAAA 13800 

AGTCCGATAG CTATTCTTCA ACTGTGCATG TTCGTCATAT CCGTGAGCAG ATAGAGCTCT 13860 

CAGGTAAAGA TGGCGCCACC TAAAGACTGT CATCAGAACC TTACTGTAAA TCAAGGGCGA 13920 

CCAAAAATGT AGTTCTTGAC CACGTAATAG GCAAGCTTCT TTGAGGGACT TGATTTCTTG 13980 






CTGAATGAGA GGAAAAGAAT TGAATACCAC AATCAAGGCA TAGGACCAAG AGCGTGATAG 14040

CCCCTTTTGA GCCAAGTACA AGAGAAGCTC TTTTAGTGAA ACAGAGGAAA CAAAGACAAG 14100

GCCGATACAA ACTGTCACAA AGGCCCTCGT TCCAAGCATG ACTGCCTGTG AAGCATCTCC 14160

GTGTAACTGA ACTGCCCAGT AGTTGGCAAA AGATGGTAAA ATGGCAAGTA TGATCATCCA 14220

AGCTAACATT TTAAATCGAC GGTAATAGAG CATAAAGAGA ATACAAAATG CGACTACCGA 14280

AAGAGTCAGA GCAATCGAAG GAATGAAAGA TGTTTCCAAG GATAAAATCA GCAAGAAGAG 14340

ACTGATAATC GGTGTCTGGG TTGCTACTTT GACCATACTA TCTCACCTCC CCTTGGGTAT 14400

TGCTACTCTG AGATGTAAGT GGTTTGGTAA TGGTCACTTC TTTCACATGC CGAAGACCCT 14460

GACTAGTCAT CTCAATCCAA TAATCAACCA CAGAAATCAA AGGGTCTAAA CGATGACTAA 14520

TGAGCAGAAA ACTTCTTCCT TGATTCCTCT CCTCCACAAT CCACTTGCAA AAATAATGGC 14580

AGCCTCTATC ATCCAAACCT GCAAAAGGTT CATCTAGCAA GATCACGGAA GCCTTACTGG 14640

TCAAGATCCT CAGGAGCTGA AGAATTTTTT GCTGACCACC ACTTAATTGA TAGGGACTCT 14700

TATCGACTGC CTGCTCCAAA TCAAAATATC GTAAAGCTTG AAAAATCCGC TGATTTCTTT 14760

CAGAATCAGG TCCATCTAAT TGAAGCTCCT CTCGCAGACT GACTCCGATA AACTGCTTCT 14820

CAGCTTCCTG AACAACACCA GTCAGATCAC GATACAAACT CTTTTTCTTT TTCAGGACCG 14880

AACCCTTCCA AGTAATGCTC CCCTTATACT TTTGAAATTG AAGAATAGAC CGAAAGAGGG 14940

TTGATTTCCC GACACCATTG TCACCCAGGA TACAGGAAAT CCCTTGATAC AATGTGAAAT 15000

CAGCAATTGA AAAGAGGGGC CGATTACCAA GCTCACCAGT CACACGGTTC ATATGGAATA 15060

GTTCCGGGCT AGAAGCAACT TCCTTTGAAG CAACCTGTGT CATCTCATAG GAAGGCATTT 15120

GAAACACTTC CCTTAGTTTT CCGTCTCTTA GCTCCACCAT ATGGTCGATA TACGCTTTAT 15180

AGTCAGATAA ATCATGCTCG CACAAAATAA CTGTCTTCCC ATCATAGACC AACTCTTTTA 15240

GAATCTCCAA TATCTCGATT CTGCTCTTGC GGTCAATGGA AGCGAAGGGC TCATCCAAGA 15300

GATAGACCCT AGGATTCATG GCAAAGAGGA CAGCCAGCGC TGCTTTTTGC TTTTCCCCAC 15360

CTGATAAGTG ATGGATGAGA CGGTGCAAGA TGTCCTTGCA ACGACATTGC TGCACAACCT 15420

CTGCTATTTT AGAATCAATT TCCTGAAGGT GATAGCCGAT ATTTTCCATG GTAAAAACCA 15480

ACTCCTCAAA CAAGCTCTCC ATGGTAAATT GATGATTAGG ATTTTGCAAG AGAATACCAA 15540

CCGTCTGGAC ACGTTCGACG ATAGAAAGCT GACTGACCTC GCTCCCATCT ATCAGGACTT 15600

GACCGCTATA GGGAAGAGAA CTAACTTGGG CAATCATTTG AAAGAGGCTG GATTTTCCAG 15660

ACCCACTACT CCCAACTAAC AAGGTAAAGG CTTGCGCATG AAAAGTAAAA TCAAACGCCT 15720
CAGAGAAGAT TGGGGACTGA ATCGCTCGTA GTTCCAGACC CATCTATGCT TTTCCTCCAG 15780

TTGCAAACTG ATGATAGAGT TTGACAATGG CACGAACCAA GATGGTACAG AAGAAATAAA 15840

CAGAAATAAA ACGTACCACA AGCAAGGAAA GGACAAACGG AAGGGAAAAG GCGTAGTAAC 15900

CTAACTTAAT GTATTCATAG ACAAAGCTAA CAAGCGTAAT CCCAATACTA TTAGCAGTTA 15960

GAGAGAGCCA ACTTTCATAG CGATTCTTAG TTACGATAAA ACCAAATTCA CTTCCCAAAC 16020

CTTGAACAAA GCCACACAAA AGAGCTCCTA GACCAAATTG GCTACCATAA AGGACTTCAG 16080

CAAGCGCAGC TAGCACTTCT CCAATCGTTG CACTTCCGAC TCTCGGAACA AAGATGGCAG 16140

CAATGGGCGC AGCCATACAC CAGAGACCGA AGAGGATTTC ATTGGCAAAG GCCTGCAAAC 16200

CAAGACGTGT TAAGAGTAGA CTGAGAATAT TATACACATA TCCTGAACCA ACGAAAACCC 16260

CACCAAAAAA GATAGACAAG AAAGCAAGCA AGATAACATC TTTTAACTGC CATTTTTTCA 16320

ACATAAAAAA CTCCTTTTTT TAAAGAAAAG TGAGGCACTC AAGAAGACCG ACCTAAATAC 16380

TTTGTATAGC AGACTGAATT TAGAACAGTA CACAAGAACA CTAAAATATT TCTAGAAATT 16440

AATTTGAATT TTCTAATTGA TTTGTTCGCA TCTTATTTCA ATCTACTATA TCATCTTCAT 16500

CCAGTTTCGT AAAAGAAAAA ACTCTAATTA CAGATACAAA TTAGAGTTCA GCTTACAAGA 16560

TTAGACAGTT CTTTTCGACA TACGAAAAAA ACATTTCACA TTTCCCTTCG CCAGTCTTAA 16620

CTGTATCAGG TTCAATGGGT ATCATCTCAG CCTAAAGCAC CCCAAATGTC TTTATTATTT 16680

AATTATGTGA TTATTATAAC ACACATTTTA TACTAGTTCA AGAAATTGAA CTGGAAATAC 16740

AGCCTTGCAC TCACAAAGAC AGCAGATCTT TCTTTTGCAA AAAACAAATG ACCTGTTTGA 16800

TGAATTAGCC ATTCAAGCTG AATCTGGACA TAGCTTTTTA AAAAAGGAAA ATCCTACTTA 16860

CTTAGAATCC AAGGATAGAT ATCTATTGTT CACTCATTTC CCGAACAGTT TTTTCTATAT 16920

TTTTTGCATA CGATATTGCC GAAATGATTG AAACGCCATC CATATTGGTC TTTATAATGT 16980

CTTTAATATG TTTCGTCTGT ATCCCACCAA TTGCAACTAA AGGCATTTGT GGCAATAGTT 17040

TTCTCATCAA TTCAAGACCT TCATAACCTA TAGTACCACC AGCATCATCC TTTGACTGCG 17100

TACCAAATAC AGGCCCAACA CCTACATAAT CTACATATTC AACTTTTGAT TGTTGAAATT 17160

CTTCTTCGTT TCTTATACAA AGACCAATTA TTTTATCTCC CATCAATTTT CTAATTTCAT 17220

CAACACCAAT ATCATCTTGA CCTACATGTA CGCCATCGGC GTCAATTTCC ATTGCTAAAT 17280

CTATATCGTC ATTAACGATA AATGGAACAT TGTATTTTTT ACAAAGTTCT TTAATTTGGA 17340

TAGCTAGCTC AACTTTTTCT AAGCCTTCTA AAGCACCCTC ACCTTTTTCT CGAAATTGAA 17400

ATAAGGTTAT ACCACCTTTT AAGGCTTCCT CAACGACTGT ATATAGATTT TTTCCTTGGC 17460

AAGTAGTCGT TCCACAAATA AAATATAGTT TTAGTAATTC TTTATGAAAC ATCTTACTTC 17520
ACTCTTTTGA ATTCCTTTAC ATCTTCATCT GTAATCTCGT ATAAGGCATT TATAAATTCA 17580

ACTTTAAATG TCCCAGGAAG ATGTCCATTT GGACGTTTTT CTGCTATTTC TCCAGCGATA 17640 

TTGTAAACCA ACACTGCTGT TTTTAATGAT TTCAATTCTT GACCTTTTTC TAGTCCGATA 17700 

AAGCTTGCTA CTACAGCTCC TAATAAGCAT CCTGTCCCAA TGACTTTCGG CATCATAGCA 17760 

CTACCATTAT GAATCATTAC CACTTCTCCA TTAACAGCAA TGGCATCCAC TTCACCTGTT 17820 

ACTACTATTG GAATATTGAA CTTCTCATTT GCTGCTAGAG CAATTTCGTC AATATTATCT 17880

ACGCCCGCAC TATCTACTCC TTTAGATGCC ACATCTATTC CTACTAAAGA GGCAATCTCG 17940 

CCAGCATTTC CTCTAATCGC TGCTAGTTTA TAATTGTTGA TTAGATCATC TGCTACTTTT 18000 

TTTCTATATT CTCCTGCTCC ACAGGCTACA GGATCTAAAA CTGCTGGGAC ATTATATTTC 18060 

TCTGCAATTT TCAGAGCAGC TTGGTATAAT TTCCAATTTT CATCTGTCAA TGTTCCTATG 18120 

TTTATTAATA AACCACCAGC ATACTTTAAC AAATCCTCTA AATCTGCTGG AAACTCACTC 18180 

ATGGCTGGTG AGGCGCCCAG TGCTACTAAT CCATTTGCTG TGAAATTTTT TACTACATCA 18240 

TTGGTTATAC AAATGACCAA TCGTCCTTTT TCTTTTAATA ATTTTAAACT TGTCATATTG 18300

AAATCCTTCC TTTTCACTTT ATACGATCTA CTAATTTCGA TTTATCTTTA GTTGAGAATT 18360 

TTTTTCATTT ACATTGAATG ATTTATACTC AATGAAAATC AAAGAGCAAA CTAGGAGCCT 18420 

AACCGCAGGT TGCTCAAAAC ACTGTTTTGA GGTTGTGGAT AGAACTGACG TGCTTTGAAG 18480 

AGATTTTCGA AGAGTCTTAC CTCATCAAAT TTGTAAATAT CATCAGCCTT CTCTAGACAT 18540 

CCTAACCAAT ATCAAAAAAA GCTAATTCTA AAGCCACTGC TTGATTCCAC CGTTCCTGAA 18600 

GTTCTGTCAA ATCTTCTCGA TTTTTACCGA CACGATTGAG TTCGTCAACC AGAAATTGAA 18660 

CCCACTCTGC AAAGAAAGGA CCTCTGTGGA GATTGATCCA TTCCGAATGA ATATAGACTT 18720 

CAGGTAAAGC CAAATCTTTA GAACCCCAGT CTAAATAGAG ACCTTCTGCA ATGACCAGCA 18780 

TGACCAAAAG ATGGGCATAG TCTGATGAAG CCACCGCCGA ATACATTAGA TCCTGAAAGG 18840 

CTTTTGTTAC AGGGTGCAAA GTCACTTCTA GATAGTCATT CTCTGCTACT TTTAACTCTT 18900 

TAAAAGCCTT TTGGAAATAA CCATCTTCAT CTGCTTCAAG AAAGCCTAGT TGCTTGGCAA 18960 

AACGAAGCTT GGATTCAAGT TTATCTGCGT GACTACGCAG GCACCCAGCA TGGATAAGAA 19020 

GGCATCAAAG AAGTGATAAT CTTGAATCAG ATAGTCCTTT AAGACCTTAT TCTCAATTGT 19080 

CCCCGCAAAA AGTTCCTTAA CAAAACGATG A7TGATTGCA GCCTGCCAAT CCTTCTGACT 19140 

GCTTTTTAAT AATTCTCCAA CAGTCAAACC TGGCTGAAAT GCATAGTCTT GTGTTTCCAT 19200 

ATTTACTTCT CCTCTCTTTA CTTGTTAGTA ATTAATAAAA CACCAAGAAA TATCAAGCAA 19260
AATCGTAATT CCACTTGATC CTTTTAAAGC ACATCGAGAG CftTT* -AGA GAGCTAACTA 19320

AACAAGCCTA TCCAGTTTAT ATAAACAAAA AACTCCAATT ACAATCAAGA ATTAGAGTTG 19380

ACTTACAAGA TTAGACCGTT CATTTCACCA TACGAAAAAA CTGTTCACAT TTCCCTTCGC 19440 

CAGTCTTAAC TGTATCAGGT TCAATGGGTA TTATCTCAGC CTAAAGCACC CCAAATGTCT 19500 

CTATTATTTA ACTACTGAAC CAGTATAGCA AAAAATGAAA GCCCTAGCAA GATATTTGAC 19560 

CGAAAAATAT CTTTATATAT AATATATTGA AACTAGAATA GTACACCTCT ACTTATAAAA 19620

CATTGTTAGA AATCGATTTG ACTGTCCTGA TTGATTTGTC CTATTCTTAT TTCATTTTAC 19680 

TATAGTTTTC GATAGCAATT TATTCTTCCA ATACACGAAG AAAAACCTCC ACATTCAGTG 19740 

GAGGCAATCT GTTTTATCAA TACAATTTTA AGTCACGAGG GTCAACTGGG AAGGTTGGGT 19800 

TGTATGGATT GTGACGGAGC TTGAAGTGTT TGACATCTTC AATGGTCTGA GTTCCAGACA 19860 

ATTGCATAAC TGTCTTCAAT TCCGCATTCA AGTGTTCAAA GACTTGACGC ACACCGACAC 19920 

TACCACCGAG AGCCAAGCCA TAGATGACAG GGCGTCCAAT AGCAACCAAG TCTGCTCCTG 19980 

ATGCCAAGGC TTTAAAGACG TGTTGACCAC GACGAACACC AGAGTCAAAG ACAATCGGCA 20040 

CACGTCTATC AACTGCTTCT GCCACTTCTT GAAGCGAGTC AAAGGCAGCT GGTCCACCGT 20100 

CGATTTGACG ACCACCGTGG TTGGTTACCC AGATACCAGA AGCTCCTGCA GCAAGCGAAC 20160 

GTTCAACGTC CTCACGGCAT TGTGCTCCCT TGACATACAC AGGAAGACCA GAGTATTCAG 20220 

CGATAAATTC TACATCGCGT GGAGACAAGC GTTGTTTAGC TGATTTGTAA ACAAACTCCA 20280 

TTGATTTACC AGCACCTTCT GGCAGGTATT CTTCAACAAT CGGCATGCCA ACTGGGAAGA 20340

CAAAACCATT ACGCTTATCC ACTTCACGAT TCCCCCCTAC AGTAGCATCT GCCGTCAAGA 20400 

CAATCGCTTT ATAACCTTCA GCCTTCACAC GGTCCATGAT GTGGCGGTTG ATACCGTCAT 20460 

CCTTACTAAA GTAAAATTGA AACCAATGAG GTGTCCCTTG GAGGGCTTCA GAAATCTCTG 20520 

GAAGGTCAAC AGTAGAGTAA GAACTGGTTG TATAAAGAGA ACCAAACTCA TGCACACCAC 20580 
GCGCAGTCGC CACTTCCCCC TGTTCATTTG CCAATTTATG AGCCGCAACA GGTGCCATAA 20640 
TGATTGGAGA AGATAGTTTT TCACCTGCAA ATTCAATCTC TGTACTTGGA TTTTCTACAT 20700 
TGCAAAGTCT ATGAGGAACG ATGAGCTTGT GGTTAAAGGC ACGGATATTC TCTCTTAAAG 20760 
TGAAAGTATC TTCCGCCCCA CTAGCGATAT AGCCAAATGC TGCTTTAGGA ATAACTTGTT 20820 
GCGCCATTGG CTCCAAATCA TAGGTATTGA TGAArTCTAC ATGACCTTCT GCATTGCTTG 20880 
TTTTGTATGA CATAAAATGT CCTCCTTAAT AAGTAAGCGT TTACTTTGTG TATTACAAAA 20940 
ATATCTTAAC TCTTTTTCAA AACTTTTAAA ATATTTTGTT TGGAAATTTC AGAAATTTTA 21000 
TGTCTATGAT AAAAATCCTT ATAACGGCAA TAAAAAATAG ATATTATCCA AAGAAGATTT 21060 
TAAGTGCTAC AATAACTGTA TTATTTCTAG ATGGGAGGTT CTATTTTTGG ATTGATCCAT 21120

TGTTGAACAA TATCTACCAC TATATCAAAA GGCATTCTTT CTGACCTTGC ATATTGCAGT 21180 

TTGGGGAATT TTGGGATCCT TTCTGCTCGG TTTAATCGTT AGTATCATCC GACATTATCC 21240 

AATCCTTGTT TTGGCGCAAG TAGCGACAGC CTACATTGAA TTGTCACGTA ATACGCCCCT 21300 

TTTGATTCAA CTCTTCTTTC TCTACTTCGG TCTTCCCCGA ATCGGGATTG TCCTATCTTC 21360 

AGAAGTCTGT GCAACGCTTG GGCTTGTCTT TTTAGGAGGC TCCTATATGG CAGAATCTTT 21420 

CCGAAGTGGG CTGGAAGCCA TCAGTCAAAC CCAGCAGGAG ATTGGCCTCG CTATTGGTCT 21480 

GACACCTCTA CAGGTCTTTT ACTATGTGGT TCTTCCGCAA CCAACAGCGG TGGCACTCCC 21540 

CTCCTTTAGT GCCAATGTCA TTTTCCTTAT CAAGGAAACC TCTGTTTTCT CAGCAGTGGC 21600 

TTTGGCCGAC CTCATGTACG TCGCCAAGGA TTTGATTGGT CTCTACTATG AGACAGACAT 21660 

TGCGCTAGCT ATGTTGGTAG TTGCTTATCT AATCATGCTG CTACCCATCT CACTGGTCTT 21720 

TAGCTGGATA GAAAGGAGGC TCCGCCATGC AGGATTCGGG AATCCAAGTA CTCTTTCAAG 21780

GAAATAATCT CCTGAGAATC TTACAGGGAT TCGGCGTTAC GATTGGGATA TCCATCCTGT 21840 

CTGTCCTCTT ATCCATGATG TTCAGAACAG TCATGGGAAT CATCATGACC TCCCATTCTA 21900 

GAATCATACG ATTTTTAACA CGATTGTATC TGGAATTTAT CCGTATCATG CCCCAGCTGG 21960 

TGCTACTCTT CATCGTTTAC TTTGGCTTGG CTCGAAACTT TAATATCAAT ATCTCAGCTG 22020 

AGACTTCAGC TATTATCGTT TTTACCCTCT GGGGAACAGC TGAAATGGGA GACTTGGTAC 22080 

GTGGAGCTAT CACTTCTCTC CCTAAACATC AGTTTGAAAG TGGACACGCA CTCGGCTTGA 22140 

CTAATGTTCA ACTTTACTAC CACATCATCA TCCCACAAGT CTTAAGAAGA CTGCTACCGC 22200 

AGGCTATCAA TCTTGTCACT CGGATGATTA AAACCACTTC ATTAGTTGTT TTGATTGGGG 22260 

TTGTGGAAGT GACCAAAGTT GGACAACAAA TCATCGATAG CAATCGCCTG ACCATCCCAA 22320 

CTGCTTCATT TTGCATTTAT GGAACCATTC TAATCTTATA TTTCGCACTT TGCTACCCTA 22380

TTTCCAAACT ATCCACTCAC TTAGAAAAAC ATTGGAGAAA CTAAATGTCT GAAACTATCT 22440 

TAGAAATCAA GGAACTAAAA AAATCCTTCG GAGACAATCC CATCCTCCAA GGACTTTCTC 22500

TAGAAATCAA AAAAGGGGAA GTTGTTGTCA TCCTAGGGCC ATCTGGTTGT GGGAAAAGTA 22560 

CCCTCCrrCG TTGCCTCAAC GGCTTAGAAA GTATTCAAGG TGGAGATATT CTTCTGGATG 22620 

GTCAGTCTAT CGTTGAAAAT AAAAAAGATT TTCACCTAGT TCGCCAAAAG ATTGGCATGG 22680 

TCTTTCAAAG TTATGAACTC TTTCCCCATC TGGATGTCTT ACAAAACCTC ATCCTAGGCC 22740 

CTATCAAAGC TCAAGGAAGG GACAAGAAAG AAGTAACGGA AGAAGCTTTG CAATTACTAG 22800 
AGCGTGTCGC TTTGCTGGAT AAACAACATA GCTTTGCCCG TCAATTATCT GGTGGACAGA 22860

AGCAACGTGT TGCAATTGTC CGTGCCCTCC TAATGCATCC AGAAATCATC CTTTTTGACG 22920

AGGTGACTGC TTCGCTGGAT CCAGAAATGG TGCGTGAGGT GCTGGAACTT ATCAATGATT 22980

TGGCCCAAGA AGGCCGTACC ATGATTTTAG TAACCCACGA AATGCAGTTT GCCCAAGCCA 23040

TTACTGACCG GATTATCTTC CTCGACCAAG GGAAAATCGC TGAAGAAGGA ACAGCTCAAG 23100

CCTTCTTTAC CAATCCGCAA ACCAAACGAG CCCAGGAATT TTTAAACGTC TTTGACTTTA 23160

GCCAATTCGG CTCATATCTA TAAAGGAGAT TCTTATGAAA CTATTCAAAC CACTCTTAAC 23220

TGTTTTAGCA CTTGCCTTTG CCCTTATCTT TATCACTGCT TGTAGCTCAG GTGGAAACGC 23280

TGGTTPATCC TCTGGAAAAA CAACTGCCAA AGCTCGCACT ATCGATGAAA TCAAAAAAAG 23340

rYMTPGAAPTG CGAATCGCCG TGTTTGGAGA TAAAAAACCG TTTGGCTACG TTGACAATGA 23400

TfifZ TTf* TT & CAAGGCTACG CTACGATATT GAACTAGGGA ACCAACTAGC TCAAGACCTT 23460

f^TCTCAAGG TTAAATACAT TTCAGTCGAT GCTGCCAACC GTGCGGAATA CTTGATTTCA 23520

AAPAAGGTAG A T ATT AfT CP TGCTAACTTT ACAGTAACTG ACGAACGTAA GAAACAAGTT 23580

fi&TTTTnrrr TTfVATATAT GAAAGTTTCT CTGGGTGTCG TATCACCTAA GACTGGTCTC 23640

nil t\\m f\\J r\\^\j TV A A APAAf T TGAACGTAAA ACCTTAATTG TCACAAAAGG AACGACTGCT 23700

CI Afl A f*TT A TT TTGAAAAGAA TCATCCAGAA ATCAAACTCC AAAAATACGA CCAATACAGT 23760

G A fTf^TT Af*f* AAGCTCTTCT TGACGGACGT GGAGATGCCT TTTCAACTGA CAATACGGAA 23820

Cl'VVt TAf^CTT GGGPGCTTGA AAATAAAGGA TTTGAAGTAG GAATTACTTC CCTCGGTGAT 23880

CCCGATACCA TTGCGGCAGC AGTTCAAAAA GGCAACCAAG AATTCCTAGA CTTCATCAAT 23940

AAAGATATTG AAAAATTAGG CAAGGAAAAC TTCTTCCACA AGGCCTATGA AAAGACACTT 24000

CACCCAACCT ACGGTGACGC TGCTAAAGCA GATGACCTGG TTGTTGAAGG TGGAAAAGTT 24060

GATTAGTCAT TAACTCTTAA AAGGAACTGG ATTTTAAGCT CCAATCCCTT TTTAAGATTT 24120

TACCTATAAC ATCCTGAGTC TATCTAAGAT CTTCAATCTG AACACAGTGT ACATACITl'A 24180

TCTTCTATTG CATATACTTT ATCACATAAG ATACGAATAT CCTCTTCACT ATGACTAGCA 24240

ATCAAAATTG TTGTCCCTTT TTCACTAGAG AGCTTTCTAA ACAATGTTCT CATATTTTCT 24300

ACACTTGATT TATCCAAGGC ATTCATAGGT TCATCTAGTA AAAGAATAGA GGGATTCTCC 24360

ATAATTGCTT GAGCAATCCC TAGCTTTTTC CTCATACCTA GCGAATAAGT TTTAACTTTC 24420

TGGTCTTTTT GCTCATATAG ACCAACTATT TTCAGTGTAT CATTGATTTC CTGATTACCA 24480

ACTACTCCTC GTATCCTTGC CAAATATTGT AAATTCTTAA AGCCACTATA ATAATTTATA 24540

AAACCAGGTT CTTCAATCAA AGCTCCCAAA TTACCTGGAA TTTTTCTCTC AGGAACAATA 24600
TTTTCCCCAT TGATTAACAC TTCTCCATAA GACGGACTAT ATAAACCAGC TATTAATTTA 24660

AACAATACAC TTTTCCCTGA GCCATTCGCA CCAGTAATTC CTATAATTTC CCCCTGTTTA 24720 

CAACTAAAGT TAAGGTTTTG AAAAACACAT GTCTTTTTTA ATTTCAACTC AATATTTTTT 24780

AATGTAATTA TTTCATTCAT TCTATAAACC TCCTCTTTTG ACGAGTGAAA TAGAAAATGC 24840 

TTTGAAAAAG AAAGACTAAA AATAGCAACT GAAGAAATAA ATCTCGTCCT ATATCTCCAT 24900

TCCCTCGATT CAAAATATAA AATAGATAAT TAGTTCGATT TCCTACAAAT AGACCACCAA 24960 

ACACAATCAT GAGTAAAAAG AAACTAACGC AAGCAAAGTT CG 25002 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11443 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



CAGGTACGGT GAGGCGCAAC TAAAATATAA TTTTCATCTT GATTAGGAAT TTTATCAGTA 60 

TTATGATAGT GAGCATTGCC ATTGATGGAC CATAAGAGCA ATACAACTAA TCCACGCAAA 120 

TAAGTATAAA ACATGCGATC TCCTTCGATT GTTTTCTTGT TATTATTATA CCTTATCAAA 180 

GGAGGGCTGG CAAACTTTTC CCTTGACTAG ATACATATTT AGGATGAAAT TAGAATTCTG 240 

TTAAAAAAAA TGATATAATA GAATTTATGG ATAAAAATAA GATTATGGGA TTAACCCAAA 300 

GAGAAGTCAA GGAAAGACAG CCTGAGGGTT TGGTCAATGA CTTTACCGCA TCACCCAGTA 360 

CCAGCACTTG GCAAATCGTT AAACGAAATG TCTTTACCCT TTTTAACGCT TTGAACTTTC 420 

CCATTGCTTT GGCTCTTGCC TTTGTGCAGG CTTGGAGCAA TCTGGTCTTC TTTGCTGTTA 480 

TCTGCTTTAA CGCTTTTTCT GGGATTGTGA CCGAGCTACG AGCCAAACAC ATGGTGGACA 540 

AGCTCAATCT CATGACCAAG GAAAAGGTCA AAACCATCCC TGATGGTCAG GAAGTTGCTC 600 

TTAATCCTGA AGAATTAGTG CTAGGAGATG TCATTCGTTT GTCTGCAGGA GAGCAGATTC 660 

CTAGTGATGC CTTGGTTTTG GAAGGCTTTG CGGAAGTCAA TGAAGCCATG TTAACGGGAG 720 

AAAGTGATTT GGTGCAAAAG GAAGTTGACG GCTTACTTTT GTCAGGAAGT TTCCTAGCCA 780 

GTGGGTCAGT TTTATCTCAA GTTCACCATG TCGGTCCAGA CAACTATGCT GCCAAACTCA 840 

TGCTTGAGGC TAAGACCGTT AAACCCATCA ACTCCCGTAT CATGAAATCG CTGGACAAGT 900 

TGGCTGGTTT TACTGGGAAG ATTATCATTC CCTTTGGTCT GGCTCTCTTG CTGGAAGCCT 960 
TGCTTTTAAA AGGCCTGCCT CTCAAGTCAT CCGTTGTAAA CTCGTCGACA GCTCTTTTGG 1020

GAATGTTGCC TAAGGGAATT GCCCTTTTGA CCATTACTTC GCTCTTGACT GCAGTGATTA 1080 

AGTTGGGCTT GAAAAAGGTC TTGGTGCAGG AGATGTACTC TGTTGAGACC TTGGCGCGCG 1140 

TGGATATGCT CTGTCTGGAC AAGACGGGTA CCATCACCCA AGGAAAGATG CAGGTGGAGG 1200 

CTGTTCTTCC GTTGACGGAA ACGTATGGTG AAGAGGCTAT TGCCAGCATC TTGACTAGCT 1260 

ACATGGCCCA TAGTGAGGAT AAGAATCCAA CTGCCCAAGC CATTCGCCAG CGTTTTGTGG 1320 

GAGATGTTCC TTATCCTATG ATTTCCAATC TTCCCTTCTC GAGCGACCGC AAGTGGGGGG 1380 

CTATGGAGTT AGAAGGCTTG GGGACAGTTT TCTTAGGGGC ACCTGAGATG TTGCTTGATT 1440 

CTGAAGTCCC AGAAGCTAGG GAGGCCTTGG AGAGAGGATC ACGTGTCTTG GTCTTAGCTC 1500 

TCAGTCAGGA GAAATTAGAC CATCACAAAC CACAGAAACC ATCTGATATT CAGGCTCTAG 1560 

CCTTGCTGGA AATCTTGGAC CCCATTCGAG AGGGAGCAGC AGAGACGCTG GACTATCTCC 1620 

GTTCTCAGGA GGTGGGACTC AAGATTATCT CTGGTGACAA TCCAGTTACG GTGTCCAGCA 1680 

TTGCCCAGAA GGCTGGTTTT GCGGACTATC ACAGCTATGT AGATTGCTCA AAAATCACCG 1740 

ATGAGGAATT GATGGCCATG GCGGAGGAGA CAGCTATTTT CGGACGTGTT TCCCCTCATC 1800 

AAAAGAAACT CATCATCCAA ACGTTGAAAA AAGCGGGACA TACAACGGCT ATGACAGGGG 1860 

ACGGGGTTAA TGATATCTTG GCCCTTCGTG AGGCGGATTG TTCTATCGTG ATGGCGGAGG 1920 

GGGATCCAGC AACCCGTCAG ATTGCCAATC TGGTTCTCTT GAACTCAGAC TTTAATGATG 1980 

TTCCTGAGAT TCTCTTCGAG GGTCGTCGCG TGGTCAATAA CATTGCCCAC ATCGCCCCGA 2040 

TTTTCTTGAT AAAGACCATC TATTCCTTCC TGTTAGCAGT CATCTGTATT GCCAGTGCTT 2100 

TACTAGGTCG GTCAGAGTGG ATTTTGATTT TCCCCTTCAT TCCGATCCAG ATTACCATGA 2160 

TTGACCAGTT TGTGGAAGGT TTCCCACCAT TCGTTCTGAC TTTTGAGCGA AATATCAAAC 2220 

CTGTTGAGCA GAATTTCCTC AGAAAATCCA TGCTTCGTGC CCTACCAAGC GCTCTCATGG 2280 

TCGTCTTCAG CGTCCTGTTT GTGAAAATGT TTGGCGCGAG TCAAGGTTGG TCTGAGTTAG 2340 

AAATCTCAAC TCTACTCTAT TATCTCTTGG GGTCAATTGG TTTCTTATCC GTATTTAGAG 2400 

CCTGCATGCC ATTTACCCTA TGGCGTGTCC TCTTGATTGT TTGGTCAGTA GGAGGTTTCC 2460 

TAGCCACAGC TCTCTTCCCA AGAATTCAAA AACTGCTTGA AATTTCAACC TTAACAGAAC 2520 

AAACGTTGCC TGTTTATGGT GTCATGATGT TGGTCTTTAC CGTGATTTTC ATCCTGACCA 2580 

GTCGTTACCA AGCGAAAAAA TAAATCAAAA CCACCAGTGT GAACTGGTGG TTTGTTCTGC 2640 

GGCTATAAGC CGCTTCTACC GGCCAGGGCC AAAGGCCCAC CGAAATAGCT TCCTCGCGCA 2700 

CCACTTTCCC GAGCAGGTGC TAAAGCACCT TAGTTACTTC CTCTTATTTA TTTCGCCAGT 2760 
AAACGGATCT ACTGACTCGA ATAACGTGAG CTGGTCTGCT ACTCTGTCTT CTTGTAATTG 2820

ATTCTGAATA TATTCAGCTA TCACTTTCTG ATTACGGCCT ACCGTATCTA CATAATAGCC 2880

TCTACACCAA AACTTGCGAT TGCCATATTT GTATTTTAAA TTCGCATGCT TATCAAAAAT 2940

CATCAAACTG CTCTTGCCCT TTAAATAGCC CATAAAGGAC GAAACACTAA GTTTCGGAGG 3000

AATACTGATA AGCATGTGAA TATGGTCTGA ACAAGCATTC GCTTCATGGA TTATTACACC 3060

CTTACGCTCA CATAAGTCAC GTATGATTCT TCCGATACTA CCTTTGTATC TGCCATAAAT 3120

GATTTGACGA CGATATTTGG GTGCAAAAAC AATATGATAT TTACAATTCC ATGTGGTATG 3180

TGATAAACTT TGATTATCCT CTCTCATGAG GTACCTCCTG TATCATATGT TGTAGTGGCG 3240

GAGAAACCAC TTCTATCTTA TCATTTTAGG AGGTTCTTTT TGTTACCACG CTAAAAGCTC 3300

TATGGAAcCA CTAGCATAGC TAGTGGTTTT CGGGAGACAA CAAGAAAGAC TGCAATCTGT 3360

GGATTGCAGT TTTTTATACG ATGGATCTAT CCTAGATCTG ATGTGCAAGG CCTACGTGCC 3420

GATCATCTAT CGGTGAACCC AAGAGCGACC CTCAAGCCTG CTTGGATTGA GGTAATAGAT 3480

TCAAATATCT GTAGTTAGAC TATTTGAACT TTGATGTAAG AAAGAGAAAG CGACAGATTG 3540

AAGTAATTTT AACTCTCTTC TATTGCTAGA ACAAATGGTC GGATAGGTTG GTAGTTTGAA 3600

AATGAAGATG CTATCTATTG TTAAATCGAA CATAGTGTTA TTTATTAGAA AATCGTTTGG 3660

TTTATTTCTT ATCAAATACG AAAAGCAACT TAAATATTTC AACTAAAATA GATGTTATGA 3720

AGAAAAGGTA AAATGATTTT GGCATAGTGA GGTTCTGTTC TATTTGATAT CATATTTTTG 3780

ATAAAAACAA AAATGTCCAT TGCAAAGGAC AAAATGCGAA GTATATTATT TTTTGAAAGC 3840

GATATAATGG ATTCATAAAG GAGGTGTATC GTGTCTAGAA AACAAGAACA AATGGAAACG 3900

TTGTTGCTCC TTTTGCGACA TAGTAAGGAT TATATATCTG CTAAAGTATT GGGAGAAAAA 3960

TTAAATTGCT CTGATAAAAC GGTTTATCGC CTTGTCAAGG GAATCAACAA AGATTGTCCG 4020

GTAGAAGCAT TCATTTTATC TGAAAAAGGC AGAGGTTTCA AATTAAATCC AAGAAGTTCC 4080

CTCGTGGACG TTGATGGGAA TTTTACAGAG GCTTTTGATC CTGAAGTAAG GCGTGAAAAA 4140

TTACTAGAAC GTCTCTTGTT GACTGCTCCT AAGCCACATT CTATTTATGA TTTAGGAGAG 4200

GAATTCTACG TAAGCGAGTC AGTAGTACTA AAAGATCGTC AGATATTACA AGAGAGTCTA 4260

GCAATTTATG GGTTAGATTT AAAAATGAGA CAACGAAAGC TTTTTATTGA TGGGGATGAG 4320

GCTCAAATTC GTTCAGCCAT TCTAAATCTA CTGCCAATGT TTAATCAGTT GGATTTAGAG 4380

CAAATTACAC AGAATAAGGT TCAGCCTCTT GACGGAGAAC TTGCTCACTT TTGTTTGGGA 4440

TTACTGATTA CACTTGAGAG AGAATTGGGG GTAAACATTC CCTATCCATA TAATATAAAT 4500
ATTTTCTCTC ACCTGTATAT TTTTATCAGT AGGAATCGTC GTAGTACTAG TATTCATGTT 4560

CTAGCACCTT CAAAACCTAC TATTGTTGAT GAGAAAATTT ACAGTGTCTC TCAAAAAATT 4620 

ATTCAAGAAA TTGAACAATA TTTTAGGATG AAGGTTGATG CAGTTGAGAT TGACTATCTT 4680 

TATCAATACG TTGTATCTTC GAGATTGCAA AAACCATTTT CTTCCGGGAA GCTTCCTTTT 4740 

TCTCAGCGAG TTTTAGATGT CACTCATTAC TATTTTAGCC GTKTGTGTAT GGACAATAGA 4800 

GAGATTGAAA CGACAGATCC TGACTTTGTT GACTTGGCGA GTCATATCAG TCCCTTACTG 4860 

AGGAGATTAG ATAATAGAGT ACAGATTAAG AATAGTCTTT TATCACAAAT TCTTTTAACC 4920

TATCCTAATC TGGTTAAAGA GTTAACAACT ATTTCTAAAG AAGTGAGTCT AGTATTTGGT 4980 

TTTGCTTCCT TGAGTCTGGA CGAGATTGGT TTTCTAGTCT TATATTTTGC ACGGTTTCAA 5040 

GAAAAGCGAG CACGTCCTCT AAAAACAGTA GTGATGTGTA CATCAGGTGT CGGAACTTCA 5100 

GAGCTTTTAC GAGCACGATT AGAAAAGCAA TTTTCTGAAT TGGATATTAT TGATGTAGTT 5160 

CCTTATCATC AATTAGATGA GCTGATAAAT CTATATCCAG ATTTAGATTT CATTGTGACG 5220 

ACGGTAGCTT TGCAGGAACC AGCAAGTGTC CCGTTTGTCC TAGTTAGTGT TTTTCTAACC 5280 

GAGGGTGATA AACAACGTCT TCAAGCAAAA ATTCAGGAGA TAAACTATGA ATAATCTTTC 5340 

GCTTGTCCTT ATGGATATAT CTGTTCAAAA TCGTCAAGAA GCCTACAAAG AATTAGCAAA 5400 

TCAAATCAGC CTTCTTGTTT CTGAAGATAC AGAAAAAATA GAAGACCTTC TATATTACCG 5460 

TGAGAGACAG GGAAGTATAG AGGTTGCTAA AGGTGTTCTT CTACCACATT GTGAAGCAAA 5520 

CTTTCAACAT CATGTCTTAG TGATTACTAG ATTAAAATCA CCTATCAGAG AATGGTCGAA 5580 

GGATATCCAG TGTGTTGACC TTATTATCGG TTTGGCCATT GCAGTATCAC AGGACAAGTC 5640

ATGTATTAAA ACATTGATGA GAAGACTAGC AGATGAATCA TTCATAAATC AATTAAAACA 5700 

GTTAACAAAA GAAGAATTAC GGGAGATAAT ATATGGAAAT CAAAGATATT CTTAATGTGA 5760 

GTCTGATCCA GACGGATTTA CAGATGCAGA GCAAAGAAGA GGTTTTTGAG GCATTAGCTC 5820 

AACTATTGGT TGAGACGGGT TATGTGTCTG ATAGAGACCA ATTTATCGAA GGTCTTTATC 5880 

AGAGAGAGGC AGAAGGACAG ACCGGTATTG GGAATTATAT TGCTATTCCC CATAGCAAGA 5940

GTTCTGCTGT GGAGAAGGCG GGGGTAGTCA TAGCTATAAA TCACAATGAG ATTCCTTGGG 6000 

AGACCATTGA TGGGAAACCC GTCAAAGTAA TTGTACTCTT TGCAGTTGGT GATGATACAG 6060 

AAGCTGCTAG GGAGCATTTG AAGACCTTAT CACTCTTTGC TCGAAAACTT GGTAATGACG 6120 

AAGTTGTTGC CAAATTAGTT CGGGCTCAGA CATCTGATGA TGTGATTGCA GCTTTTTGTT 6180 

AATAAGAAAA AATTTTGGAG GGTATCCGTA TGAAAATTGT TGGTGTTGCA GCTTGTACTG 6240 

TGGGAATTGC CCACACTTAT ATTGCACAGG AAAAATTAGA GAATGCCGCA AAGGTAGCTG 6300 
GACATGTGAT TCATGTTGAG ACTCAGGGGA CAATAGGGGT AGAAAATGAA TTGAGTCAAG 6360

AGCAGATTGA TGCAGCGGAT GTAGTTATTT TAGCAGTTGA TGTTAAGATT TCTGGTATGG 6420

AACGCTTTGA GGGTAAAAAG ATTATCAAGG TTCCAACAGA AGTGGCAGTC AAATCTCCCA 6480

ATAAACTGAT TGCTAAAGCT GTTGAGATTG TTACGAAATA ACTGAAAATA TTTAAGGAGA 6540

AAATATATGT TGAAACACTT AAACTTAAAA GGTCACTTAT TGACAGCCAT TTCCTATATG 6600

ATTCCAATTG TTTGTGGTGC AGGATTCTTA GTTGCCATTG GTTTAGCAAT GGGGGGTGGT 6660

GTTCCTGACG CTCTTGTAGC AGGAAAATTC ACTATCTGGG ATGCTTTAGC AACTATGGGT 6720

GGTAAAGCCC TTGGTCTCTT GCCACTTGTT ATTGCTACAG GTTTGTCTTA CTCGATTGCT 6780

GGTAAGCCAG GGATTGCACC AGGTTTTGTT GTTGGTCTAA TTGCCAATTC TGTTGGTTCA 6840

GGGTTTATCG GTGGTATCTT GGGAGGTTAT ATAGCTGGTT TCTTGGTTCA AGCGATTATT 6900

AAAAAGGTCA AAGTACCAAA CTGGATTAAA GGTTTAATGC CAACCTTGAT TATTCCTTTT 6960

GTAGCCTCTT TGGTAAGTAG TTTGATTATG ATTTATATTA TTGGAGCGCC TATCCCAGCC 7020

TTTACCAACT GGTTGACGAG CTTATTACAA AGCTTGGGAA GTGCTTCAAA TGGTTTGATG 7080

GGGGCAGTTA TTGGAATTCT CAGTGCTCTT GACTTTGGTG GCCCACTTAA TAAAACAGTC 7140

TATGCGTTTG TGTTGACTTT ACAGGCTGAA GGTGTGAAAG AACCATTGAC TGCTTTACAA 7200

TTGGTGAATA CTGCTACACC AGTTGGATTT GGATTGGCCT ATTTTATCGC GAAATTACTC 7260

AAAAAAAATA TCTATACTCA AGAGGAAATC GAAACATTGA AATCGGCTGT TCCTATGGGG 7320

ATTGTCAATA TTGTTGAAGG TGTAATTCCG ATTGTTATGA ATAACTTGGT TCCAGGTCTC 7380

ATTGCAACAG GTATCGGTGG TCCTGTTGCT GGTGCTGTTT CTTTGACAAT GGGTGCTGAT 7440

TCTGCTGTGC CATTTCGTGG AGTGCTTATG TTACCAACCA TCACTCGTCC AGTAGCTGGT 7500

ATTPGTGCCT TGTTAGCTAA CATTGTAGTC ACAGGACTTG TCTACGCGAT TTTGAAAAAA 7560

CCAATAAAAC ATGCAGAACC AGTTATGACT GTTGAAGAAG AGATTGATTT GTCAGATATT 7620

GAAATTTTGT AAGAGGGTAA CGATGTCAAG AATTGAATTT TCACCATCTT TGATGACCAT 7680

GGATTTGGAC AAATTCAAAG AGCAGATTAC TTTTTTGAAT GATAAAGTAG CATCTTATCA 7740

TATCGATATT ATGGATGGCC ATTTTGTTCC CAATATTACC TTGTCTCCTT GGTTCATTCA 7800

AGAAGTTCAA AAAATTAGTG ACACACCTTT ATCAGTTCAT CTGATGGTCA CAGACCCAAC 7860

CTTTTGGGTA GATCAAGTTC TCGATTTACA ATGTGAGTAT ATTTGTATTC ATGCTGAAGT 7920

TCTGAATGGT CTTGCTTTTC GTTTGATTGA TAAAATTCAT GATGCAGGTC TAAAGGCTGG 7980

TGTTGTCCTT AATCCTGAAA CACCTGTTTC TACAATCTTT CCCTACATTG ATTTACTTGA 8040
CAAAGCAACT ATTATGACTG TAGATCCAGG TTTTGCAGGA CAACCCTTTT TGGAGTCTAC 8100

CTTGTATAAA ATCCAAGAAC TCCGTCAGCT TAGAGTTCAG AATGGTTATC ACTACATCAT 8160 

TGAGATGGAT GGTTCTTCGA GTCGTAAGAC TTTCAAACAA ATTGATGTGG CAGGACCAGA 8220 

TATTTATCTT ATAGGTCGCA GTGGATTATT TGGTTTGGAT GACGATATTG CCAAAGCCTG 8280 

GGATATCTGT TCTAGAGATT ACGAAGAAAT GACCGGAAAA ACAATGCCAA TCAAATAATG 8340 

GTTTGAGAAG AAATTTATTA GTTAGGAGGA ATATATGTCA CTACAATCAG TTAACGCCAT 8400 

TCGTTTTCTT GGCGTAGATG CTATTAACAA ATCTAATTCT GGTCACCCGG GAATTGTCAT 8460 

GGGTGCTGCG CCAATGGCTT ATAGCCTATT TACAAAGCAC CTTAGAATTA CACCTGAGCA 8520 

GCCAAACTGG ATTAACCGAG ATCGCTTTAT CTTGTCTGCG GGTCATGGAT CAATGCTACT 8580 

GTATGCTCTC TTGCATTTAA CAGGGTATAA GGATGTATCC ATGGACGAGA TTAAAAATTT 8640 

CCGGCAATGG GGATCTAAGA CACCTGGTCA TCCTGAAGTG ACCCATACGT CTGGTGTGGA 8700 

TGCGACATCT GGTCCGCTTG GTCAGGGGAT TTCTACTGCC GTTGGTTTCG CCCAAGCAGA 8760 

GCGTTTTTTA GCTGCTAAGT ACAACAAAGA TGGTTTCCCT ATTTTTGACC ATTATACTTA 8820 

TGTTATCGCT GGAGACGGTG ACTTCATGGA AGGAGTGTCT GCGGAGGCGG CTTCTTATGC 8880 

AGGTCATCAA GCTTTAGATA AGCTTATCGT CCTCTACGAC TCCAACGACA TCTGCTTGGA 8940 

TGGTGAGACC AAAGATACTT TCTCTGAAAA TGTTCGCGTC CGTTACGATG CTTATGGTTG 9000 

GCATACAGTT CTGGTAGAAG ATGGAACAGA TTTAGCAGCA ATTTCTACAG CAATTGAGAC 9060 

GGCCAAGTTT TCTGGTAAAC CCAGTTTGAT TGAAGTGAAA ACGGTAATTG GTTACGGCTC 9120 

ACCCAATAAA AGTGGTACAA ATGCTGTTCA TGGTGCACCA CTAGGAGCAG AAGAAACAGG 9180 

AGCAACTCGT AAGTTTTTGG GATGGGATTA CGATCCATTT GAAGTACCAG AGGAAGTATA 9240 

TTCTGATTTC AAGACAAATG TAGCGGATCG TGGTCAGGAG GCATACGATG CTTGGGCTAG 9300 

TTTGGTGTCT GATTACAAGG TTGCTTATCC CGAAGTTGCT AGTGAGATTG ACGCTATTGT 9360 

AGCTGGAAAA TCCCCTGTAA CCATTACTGA AAAAGACTTC CCTGTCTATG AGAATGGCTT 9420 

CTCTCAAGCA ACTCGTAATT CGTCCCAAGA TGCTATTAAT ACAGCAGCAG TTTTACCAAC 9480 

CTTCTTAGGT GGATCGGCAG ACTTAGCTCA CTCTAACATG ACCTACATCA AGGCAGATGG 9540 

CTTACAAGAT AAATATAATC CATTAAACCG CAATATTCAG TTTGGGGTAC GTGAATTTGC 9600 

CATGGGAACA ATCCTCAATG GAATGGCTCT TCATGGTGGT TTACGAGTTT ATGGCGGAAC 9660 

CTTCTTTGTT TTCTCTGACT ACGTCAAAGC TGCTATTCGG CTATCAGCCA TTCAGGAGTT 9720 

GCCTGTAACT TATGTCTTTA CCCATGATTC AATTGCCGTT GGTGAAGATG GTCCAACTCA 9780 

TGAACCAGTT GAACATTTGG CAGGTTTACG CTCAATGCCA AACTTGACTG TTATCCGTCC 9840 
AGCGGATGCC CGTGAAACTC AAGCGGCTTG GCATCATGCC TTGACCAGTA CCACCACTCC 9900

AACTGTCATT GTCTTAACCC GTCAAAACTT GGTAGTTGAA GAAGGGACAG ACTTTGGTAA 9960 

GGTCGCTAAA GGAGCCTACC TCGTGTATGA TACCCCGGGA TTTGATACTA TTATCATTGC 10020 

TACAGGATCT GAGGTCAATC TAGCTATCAA AGCTGCTAAG GAATTGGTTT TACAAGGTGG 10080 

TAAAGTACGT GTGGTATCTA TGCCCTCAAC CGAACTATTT GATGCTCAAG ATGCTACCTA 10140 

CAAGGAAGAC ATTTTACCAT CTAAGACTCG TCGTCGTGTG GCCATTGAAA TGGCAGCGAC 10200 

CCAAAGTTGG TACAAGTATG TTGGTTTGGA TGGCGCGGTC ATCGGTATTG ACATCTTCGG 10260 

TGCGTCTGCC CCAGCTCAGA CTGTGATTGA TAATTATGGA TTTACGGTAG AGAATATCGT 10320 

TGCTCAAGTT AAGTCCCTAT AGAAACCAAT TACAATGAAG ATACAGCTGT TGTCAGACTA 10380 

GCAGATGTAG TGATAGACAC TAATCAGATG ATTGGTTATT TAAAAACTGT AATGAAAATG 10440 

TAATAATTTA TCTACGAAAG TTATAGTAGA TAGTATACAC AATAGAGTAT ACCCTGAAAC 10500 

GGTTGCGAAG TACGCTAATC ACTTTGCTAC TGATCTAGAT AGTTTCTTTA ATCAATAAAC 10560 

ACAGCATCCA CAGATTGACT TAGGATATTG TAAGTTTTTT GAAAGCTAGA GAGAAGGTCT 10620 

CTAAAATTAA AAAACGCATA GTATAGGATG TTGAAATGAT GAACTGCACC CCAAAAGTTA 10680 

GACAGAAAAA AATCTAACTT TTGGGGTGTT TTTATTATGA AATTAACTTA TGATGATAAA 10740 

GTTCAGTTCT ATGAACTTAG AAAACAAGGA TATATCTTAG AGAAGCTTTC AAATAAATTT 10800 

GGGATAAATA ATTCTAATCT TAGCTACATG ATTAAATTGA TTGATCGTTA CGGAATAGAG 10860 

TTCGTCAAAA AAGGGAAAAA TCGTTACTAT TCTCCTGATT TAAAACAAGA AATGATTCAT 10920 

AAAGTCTGAC ATGAAGCCTG GACTAAAGAT AGAGTTTCTC TTGAATACGG TCTCCCAAGT 10980 

CGTACGATAC TTCTTAACTG GCTAGCACAA TACAGGAAAA ACGGGTATAC TATTGTTGAG 11040 

AAAACAAAAG GGAGAGTACC TGAGAGCGGA GAATGCCATC CTAAAAAAGT TAAGAGAACT 11100 

CCGATTGAAG GAGGAAAAAG AGAAATAAGA AAGACAGAAA TTGTTCAAGA ATTAATGACT 11160 

GAGTTTTCGT TAGATCTTCT TCTAAAAGCC ATTAAACTAG CTCGTTGGAC CTACTACTAT 11220 

CACTTGAAAC AGCTAGATAA ACCAGATAAG GACCAAGAGC TTAAAGCTGA AATTCAATCC 11280 

ATCTTTATCG AACACAAGGG AGATTATGCT TATCGCCGGG TTCATTTAGA ACTAAGAAAT 11340 

CGTGCTTATC TGGTAAATCA TAAAAGAGTT CAAGGCTTGA TGAAAGTACT CAATTTACAA 11400 

GCTAGAATGC GACAGnAACG AAAATATTCT TCTCATAAAG GAG 11443

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5333 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CCAATTACAT TATATTATCA AAATCGTCGA AACTGGCTCC ATGAATGAGG CAGCCAAGCA 60 

ACTCTTTATC ACTCAGCCAA GTCTCTCCAA TGCAGTGCGA GATTTGGAAA ATGAAATGGG 120 

CATTGAGATC TTTATCCCCA ATCCCAAGGG AATCACCTTG ACCCGTGATG GCATGGAGTT 180 

TCTCTCTTAT GCCCGTCAGG TTGTCGAGCA GACCCAGCTT CTGGAGGAAC GCTATAAAAA 240 

TCCTGTCGCC CACCGCGAAC TCTTTAGCGT TTCGTCTCAA CACTATGCCT TTGTGGTCAA 300 

TGCCTTTGTC TCTTTGCTCA AGAAAAGCGA TATCGAGAAA TACGAACTCT TCCTTCGTGA 360 

AACTCGGACT TGGGAGATTA TCGACGACGT CAAGAACTTC CGCAGTGAGG TCGGGGTCCT 420 

CTTCTTAAAC AGTTACAACC GTGATGTTTT AACCAAGATG CTGGATGACA ATCACCTGCT 480 

AGCCCACCAT CTCTTCACAG CGCAACCGCA TATCTTTGTC AGCAAGACCA ACCCTCTCGC 540 

AAAGAAAGAC AAGGTGAAAC TGTCTGATTT GGAGAATTTC CCTTACCTCA GCTATGACCA 600 

AGGGACGCAC AACTCCTTCT ACTTTTCAGA AGAGATTCTT TCTCAAGAAC ACCACAAGAA 660 

ATCCATTGTG GTCAGTGACC GTGCCACCCT CTTTAATCTC TTGATTGGTT TGGATGGTTA 720 

TACCATTGCG ACAGGGATTT TGAACAGCAA CCTAAACGGA GACAATATCG TTTCTATCCC 780

ACTGGATATT GATGACCCGA TCGAGCTGGT CTATATCCAG CATGAGAAAA CCAGCCTATC 840

TAAGATGGGC GAACGCTTTA TAGACTATCT CCTAGAAGAA GTTCAGTTTG ATAGTTGAGA 900 

AATGATAAGA ACCAATATGT AGGCTAGCAA CAACCTGCAC ATTCGTTCTT TTTACTTATA 960 

ATTAAAAGTT TCCCCTGCCA ACTTATCAGC TAGCTTGGGA AAGAGAGTAT AAAACTTATG 1020 

GGCTAGGTTC AACAAAATCG GGAGATTGAG TTCTCGTTTG TTTTTTCCTA TAATCTTGAC 1080 

AATCTTTTTA GCCACTGCAT CTGGTTCTAG CAGGAAGCGA TCAACCGATT TAAGATAAGT 1140 

TCCATCTGGG TCGGCTTGGT CGAAAAATCC TGTACGGATT GGTCCTGGAT TGACTGTTGT 1200 

CACATAGACT CCATAGGGCA TAAGTTCGAG TCGCAGAGCA TTTGAAAAAC CAATAGCCGC 1260 

AAACTTGGTC GCTGAGTAAA GACTAGACTT GCCAGTAGCT ATTAGACCTG CCATGCTGAC 1320 

GATGTTGATG ATATGCCCTT TGCTGCTTTC CTTCATACGA GCCGCAAGGT GACGAGACAG 1380

ATTCATCAGG GCAAAGGTAT TGACCTCAAA CATCTGGTGA ATATCTTTAT CAGCAATCTG 1440 

GTCAAATCCC TCAAAAATCC CGTAACCAGC GTTGTTAATC AAGACATCAA TCTTGCCATA 1500 

GCGGAGATAA AGATCAGTTA CCAGAGCTTC TAGGGCTGAA TCGTCGGTAA TATCAATTTC 1560 
AATCAATTCT GCATGGGAAT AATTTCCGTA GAGTTGGGCT AATTTTTCCT TATTTCTACC 1620

AAGCAAGATG AGTTGGTCAT TGGGCAGGAG TTTGACCATT TCTTGAGCTA GACCACCGCT 1680

AGCTCCGGTA ATGAGAATAG TACGCATACT TATCCTTTCT GTGACTGCTA GATTTCCACT 1740

TCTTCCAAGT CTTTGACCAC ATGGACATTT TCAAAAATTG TGGCAGCGTC TTTCTTGAGT 1800

TTGCTAATAT CTTTTGAGAG GAAACGGGCA CTGATATGGT TGAGTAGGAG GCGTTTGGCA 1860

CCTGCTTCTA CCGCTACTTG TCCAGCTTCC ATATTAGTTG AGTGACCATG GTTACGAGCA 1920

ATTTTTTCAT CACCCTTGCC ATAAGTGGAC TCATCAACTA GGACATCTGC ATTGACAGCC 1980

AGACGCACAC TGGCACCCGT TTTTCGAGTG TCTCCTAAAA TAGTGATAAT CTTACCTCGA 2040

CGTGGCGCTC AGATATAGTC TGCTGCCTTG ATTTCAGTTC CGTCTTCCAA AACAAGATCC 2100

TGGCCGTTTT TGATTTTACC AAAAAGCGCG CCGAACGGAA CACCAGCAGC CTTGAGTTTT 2160

TCAGCATCCA GCGTCCCTTC TAGATCCTTT TGCATGACAC GATAGCCAAC ACAGAAAATA 2220

GTGTGGTCCA GCTCCTCTGC ATACACAGTG AATTTATCGG TTTCAAGAAT TTTACCCAGA 2280

GAATCTTGGT CAAACTCATG GAAATGAATG CGGTAGGGCA GACCAGAACC TGACACACGA 2340

AGGCTGGTTA AGACAAATGA CTTGATTCCT TGAGGTCCGT AGATTTCCAA ATCTGTCTGC 2400

TCTTCATTGG CCTGAAAGGC ACGGCTAGAA AGGAAACCTG GCAAACCAAA AATGTGGTCT 2460

CCATGCAGAT GGGTAATAAA GATTTTGCTG ACCTTACGTG GTCGAATTCT GGTTTCCAGA 2520

ATGCGATTTT GCGTACCTTC TCCACAGTCA AAGAGCCAAA CTTCGTTAAT CTCATCCAAA 2580

AGTTTCAGGG CGAGACTTGA AACGTTGCGG GCTTTAGAGG CCTGACCAGC CCCCGTTCCT 2640

AAAAATTGAA TA7CCATTCG ATACTTTCTA ATTAATCAAT ATATAACATG GCTGTGCCGT 2700

TTTCCGATCG GAAATAGCGT TTGCCAGAAA AACCACCAGC TTCTTGCAAT AAATCCTCTT 2760

GGCTGTAGCC TTTGAGACGT TTTCGACCAT CAGCCAATCT TTCCAAATCA GTCAAAGCTC 2820

TGAGACTTTC TAGGCTGATA ACTTCCTCCT CCTCGACAGC CTTCATGTAA ATCTTACCAG 2880

ACTCTTCAAA GACTAATTGA TGGGGGAAAA TTTGCGCAAT TTCAAAGAGC AAGTCATCCG 2940

AGATTTTCTC CTCATTTTCA AAGAAAATCC GACCAAGGCC GTCACTCTCA TAACAAAAAC 3000

CAAAGGATTT ACCAGACAGA TTAAGCCGAA TAAAAGGCTT ATTTTCTAGG GTGAAACTTG 3060

GCTCAGTATT GTAAAGATTC AGTTCCTGAC TGAGTTCTGC AAAATAATCC GTCGCAGCCT 3120

GAGGACTCTT TTTCTGATAG AGTTCTGCAA AGTAGGCATT AACAACACTT GGCGCAGGTG 3180

TAATAAGTGT TAACTCCTCC TGATCTGTTT TACCAGCTAG AAGCTGATCC AGATAGACCT 3240

TGTCCAGACT TGTATAACCT CCATACTTTA GAGCCAAAGT TTTAATATCA GTCATAAAAT 3300
TCTTCTAACC TCCATTTATT TTTCTCGGAA ATGTAGCCTG TAATc;...-rrc GCCGTCTTCC 3360

TGATAATCAC GTTCTTCCAG AATTGCAACA CTCTCTAAAT CATGAATCTT GTAGGACTTT 3420

GAAAAAGGCA CTCGCAGGGT AAATGCTTCA AAAATTTCCT TAATCTTATC TAGCAATAAT 3480

GCTTGCAAGT TTTCACGACT GTCCTCAGAC TTGGCAGAAA TGAGGCTATA TGGCGTTTGG 3540

GTAGGCGTGA AATCCTCCAC CAAATCCGCT TTATTATAAA GCGTCAAGTG AGGAATATCT 3600

TCCATGTCCA GGTCTTTCAT GATGGAGAGA ACCGTTTTTT CATGCTCCTC GTGGTAAGGA 3660

TTGCTAGCAT CGATAACATG AACCAGAAGG TCCACATGCT TGCTTTCTTC CAAGGTTGAC 3720

TTGAAACTGG ACACCAACTC TGTCGGCAAA TCTTGGATAA AGCCAACGGT ATCTGTCAAA 3780

GTTACTTGGA GATTGCCTCC CAGATGAATA CTCTTGGTTG TCGCATCCAG AGTCGCAAAG 3840

AGCTCATCTG CTTCATACTG GGTCTTACTG GTCAAGATGT TCATGATAGT TGATTTCCCA 3900

GCATTAGTAT AACCAATCAA ACCAATCTTA AAAGTGCTAG ACTCCAAACG TTTTTCTCTG 3960

ACAGTCGCAC GATTTTTCTC AACCACCTTG AGCTGGCGCT CGATATCCGT GATTTGATTG 4020

CGAACGCTAC GACGGTTCAG CTCCAGCTGG CTTTCACCAG GACCACGGGA ACCAATTCCC 4080

CCTaCCTGAC GGCTGAGCAT AATCCCCTGA CCAACCAAGC GAGGCAAAAG GTATTTGAGT 4140

[illegible] GGACTTGGAG CTTCCCTTCA TGGCTTCGAG CCCGCATGGC AAAGATATCC 4200

AAAATCAACT GCATACGGTC AATGACCTTA ACACCGAGAA CTTCCTCTAG ATTGACATTC 4260

TGCCTTGGGG TCAGACGATT GTTGACGATG ACAGTAGTGA TTTCTTCTGC ATCCACCATA 4320

AGCGCAATCT CTTCCAACTT ACCAGAGCCG ACGAAGGTCT TGGAATCATA TTTTTCACGT 4380

TTTTGTCTGT AGCTATCTAC AACCACTGCC CCTGCCGTTT TCGCTAAACT AGCCAATTCT 4440

TCCATGGAGA GGTCAAAACT GTCCATACCC TGCAATTCCA CACCAATCAG CAGGACTCGC 4500

TCCTCTTTTT TCTCCGTTTC AATCATCTAA AAACTCCTCT ATCTGGCTTA AAATGCGGTC 4560

TTGTACACCA GATTCTCCAA TCTGATAAAA GGTGACCTGC ATGCGATTAC GGAACCAGGT 4620

CAGCTGACGC TTGGCAAAAC GACGAGTCGC CTGTTTAAGA CTCTCACTAG [illegible] 4680

GGTCTGCTCT CCACGGAAAT AAGGAAAGAG TTCCTTATAG CCAATTCCTT TAGCAGCCTG 4740

TACATTAGGG GAATGGTCAA ACAGCCACTT GGCCTCATCC AAAAGCCCAG CCTCAAACAT 4800

CAAATCCACT CGGTGGTTGA TACGCTCATA AAGTTGACTA CGTTCATCAT CCAAGCAGAT 4860

AATCAGCGGT TCATACAAGG TCTCTTGATT TTCCAAATCC TGACCAAAAT GGGCAATTTC 4920

TAAGGCACGC ATAGCACGAC GACGATTAAA CTGGGGAATC TCAAGGCCTG CTTGATCCAC 4980

CAAATGGGCT AATTCCTCAT CTGAATATGG CTCCAAACTA GCTCGATAAG CTAAAATCTC 5040

CTCATGAGGA GTCTCCCCAC CTAGGTGGTA ACCTTCTAGC AAGCTCTGGA TATAAAGTCC 5100
AGTCCCACCG GCGATAATGG CTAGCTTGCC ACGGTTGTGA ATACCCTCAA TACTCATCTT 5160

AGCTTCTGAA ACAAAATCAA AAGCCGAGTA AGACTCGGTT ATCTCTCTAA CATCGATTAA 5220

ATGATGAGGA ACAGCTGCCT GCTCTTCTGG ACTAGCCTTG GCCGTCCCAA TATCAAGTCC 5280

TCGATAGACT TGCTGGCTAT CTCCACTAAC CACTTCGCCA TTAAAACGCT TTGCGGGG 5338 



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19446 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CGGAAACCCA TCTAGTCTCC ATCGTTTGGG AGACCAAGCA ACACGAATCT TAGATGCTTC 60 

TCGCCAACAG ATTGCAGATT TAATCGGTAA GAAAAGCGAT GAAATCTTCT TTACCTCGGG 120 

TGGAACAGAA GGGGATAACT GGCTTATCAA GGGTGTGGCC TTTGAAAAAG CTCAGTTTGG 180 

CAAGCACATC ATTGTTTCAG CCATTGAACA TCCAGCAGTC AAAGAGTCAG CCCTCTGGTT 240 

GAAAAGTCAA GGATTTGAAG TGGATTTTGC TCCAGTTGAT AAGAAAGGCT TGGTCGATGT 300 

TGAGGCGTTA CAGGTTTGAT ACGGCATGAT ACAATCCTCG TTTCCATCAT GGCTGTGAAC 360

AATGAAATCG GCTCTATCCA ACCTATTGAG CCTATTTCAG AATTCTTGGC AGACAAGCCG 420

ACTATTTCCT TCCACGTTGA TGCGGTTCAG GCGCTTCCCA AAATTCCCAC TGAAAAGTAT 480

CTGACAGAAC GGGTGGATTG CGCGACTTTC TCTAGTCACA ACTTCCACGG GGTTCGAGGT 540 

GTTGGCTTTG TCTATATCAA ATCTGGCAAG AAGATTACAC CTCTTCTTAC AGGTGGTGCC 600 

CAGGAGCGAG ATTATCGTTC GACAACTGAA AATGTGGCAG GGATTGCAGC GACAGCCAAG 660 

GCCCTCCGTT TGTCTATGCA AAAGCTAGAT ATCTTTAGGA GCAAGACTGG GCAGATGAAG 720 

GCAGTGATTC GCCAAGCTCT TCTGAACTAT CCGGATATTT TTGTCTTTTC AGATGAGGAA 780 

AACTTTGCAC CTCATATTCT GACTTTTGGA ATCAAAGGTG TTCGAGGTGA AGTCATCGTT 840 

CACGCCTTTG AAGACTATGA TATTTTCATC TCAACAACCT CAGCTTGTTC ATCTAAGGCA 900 

GGAAAACCAG CCGGTACCTT GATTGCCATG GGAGTGGACA AAGATAAGGC CAAGTCAGCT 960 

GTGCGTCTTA GCCTAGACTT GGAAAATGAT ATGAGTCAGG TCGAGCAGTT TTTGACCAAG 1020 

TTAAAATTGA TTTACAATCA AACTAGAAAA GTAAGATAGG AGCATTCATG CAGTATTCAG 1080 

AAATTATGAT TCGCTACGGA GAGTTGTCAA CCAAGGGTAA AAACCGTATG CGTTTCATCA 1140 
ATAAACTTCG TAATAATATT TCGGACGTTT TGTCTATCTA TACCCAAGTT AAGGTAACAG 1200

CAGATCGCGA CCGTGCCCAC GCTTACCTCA ATGGAGCTGA TTACACAGCA GTTGCAGAAT 1260 

CTCTCAAACA AGTTTTTGGA ATTCAAAACT TTTCTCCTGT TTATAAGGTT GAAAAATCTG 1320 

TAGAAGTTTT GAAGTCTTCT GTCCAAGAGA TTATGCGGGA CATCTACAAG GAAGGTATGA 1380 

CCTTTAAGAT TTCTAGCAAG CGTAGCGACC ACAACTTTGA ACTTGATAGT CGTGAACTCA 1440 

ACCAAACACT TGGAGGGGCT GTATTCGAAG CCATTCCAAA TGTGCAAGTT CAAATGAAAA 1500 

GTCCTGACAT CAATCTTCAG GTGGAGATTC GTGAAGAAGC AGCCTATCTT TCTTATGAAA 1560 

CCATTCGTGG GGCTGGTGGT TTGCCAGTTG GAACTTCAGG TAAAGGGATG CTCATGTTGT 1620 

CAGGAGGGAT TGACTCACCT GTAGCAGGTT ATCTTGCTCT TAAGCGTGGG GTGGATATCG 1680 

AGGCAGTTCA CTTTGCTAGT CCACCATATA CTAGTCCTGG TGCCCTCAAG AAAGCGCAGG 1740 

ACTTGACCCG TAAATTGACC AACTTTGGCG GAAATATCCA GTTTATACAG GTGCCTTTCA 1800 

CAGAGATTCA AGAGCAAATC AAAGCCAAAG CGCCAGAAGC TTATTTCATG ACTCTAACTC 1860 

GTCGCTTTAT GATGCGGATT ACTGACCGTA TTCGTGAGGT ACGAAATCGT TTGGTTATCA 1920 

TCAATGGGGA AAGTCTAGGT CAAGTAGCCA GCCAAACCCT TGAAAGTATG AAGGCTATCA 1980 

ATGCTGTTAC CAACACTCCC ATCATTCGTC CTGTGGTTAC CATGGACAAG TTGGAAATCA 2040 

TTGACATCGC CCAGGAAATC GATACCTTTG ACATTTCAAT CCAACCGTTT GAAGACTGTT 2100 

GTACCATTTT TGCACCAGAT CGTCCAAAAA CAAATCCTAA AATTAAGAAT GCGGAGCAGT 2160 

ACGAAGCGCG TATGGATGTT GAAGGCTTGG TTGACCGACC AGTGGCTGGA ATCATCATTA 2220

CTGAAATCAC ACCTCAACCC GAAAAAGATG AAGTTGATGA CTTGATTGAC AATCTGCTCT 2280

AATTCAGAAA ATCCAAAAGA ATAGCGAAAA TCAGTAAAAA AAGTTAGTTT TTTCTCTAAA 2340

AACAGGTAAA AAACTAACTT TTTTTATTTT TATGATATAA TGATATAAAA TTTTGAATAT 2400

AGAGAGTTTT CTGACAATGA ATCAATCCTA CTTTTATCTA AAAATGAAAG AACACAAACT 2460

CAAGGTTCCT TATACAGGTA AGGAGCGCCG TGTACGTATT CTTCTTCCTA AAGATTATGA 2520 

GAAAGATACA GACCGTTCCT ATCCTGTTGT ATACTTTCAT GACGGGCAAA ATGTTTTTAA 2580 

TAGCAAAGAG TCTTTCATTG GACATTCATG GAAGATTATC CCAGCTATCA AACGAAATCC 2640

GGATATCAGT CGCATGATTG TCGTTGCTAT TGACAATGAT GGTATGGGGC GGATGAATGA 2700 

GTATGCGGCT TGGAAGTTCC AAGAATCTCC TATCCCAGGG CAGCACTTTG GTGGTAAGGG 2760 

TGTGGAGTAT GCTGACTTTG TCATGGAGGT GGTCAAGCCT TTTATCGATG AGACCTATCG 2820 

TACAAAAGCA GACTGCCAGC ATACGGCTAT GATTGGTTCC TCACTAGGAG GCAATATTAC 2880 

CCAGTTTATC GGTTTGGAAT ACCAAGACCA AATTGGTTGC TTGGGCGTTT TTTCATCTGC 2940 
AAACTGGCTC CACCAAGAAG CCTTTAACCG CTATTTCGAG TGCCAGAAAC TATCGCCTGA 3000

CCAGCGCATC TTCATCTATG TAGGAACAGA AGAAGCAGAT GATACAGACA AGACCTTGAT 3060 

GGATGGCAAT ATCAAACAAG CCTATATCGA CTCGTCGCTT TGCTATTACC ATGATTTGAT 3120 

AGCAGGGGGA GTACATCTGG ATAATCTTGT GCTAAAAGTT CAGTCTGGTG CCATCCATAG 3180 

TGAAATCCCT TGGTCAGAAA ATCTACCAGA TTGTCTCAGA TTTTTTGCAG AAAAATGGTA 3240 

AGTTAAGAAA GGAAAAAACG AAATGCATAT TGAACATCTT AGCCACTGGA GTGGTCATCT 3300 

TAACCGTGAA ATGTACCTTA ACCGTTATGG ACATGGTGGG ATTCCAGTTG TGGTCTTTGC 3360

TTCATCAGGT GGTAGTCACA ACGAATACTA TGATTTTGGC ATGATTGATG CCTGTGCTTC 3420 

CTTTATCGAG GAAGGCCTTG TCCAGTTCTT TACCCTATCT AGTTTGGATA GTGAGAGCTG 3480 

GTTGGCTACT TGGAAAAATG CTCATGACCA AGCGGAAATC CACCGTGCCT ACGAACGTTA 3540 

TGTGATTGAG GAGGCCATTC TTTTATCAAG CACAAGACAG GTTGGTTTGA TGGCATGATG 3600 

ACGACAGGTT GCTCTATGGG AGCCTATCAT GCACTCAATT TCTTCCTCCA GCATCCAGAT 3660 

GTCTTTACCA AAGTGATTGC TCTCAGTCGT GTTTACGACG CACGTTTCTT TGTCGGTCAT 3720 

TACTACAACG ATGATGCTAT TTACCAAAAC TCGCCAGTAG ATTATATTTG GAACCAAAAC 3780 

GACGGCTCCT TTATTGACCG TTACCGTCAG GCAGAGATTG TGCTGTGTAC GGGCCTTGGA 3840 

GCCTGGGAAC AAGATGGTTT GCCATCCTTT TACAAGCTCA AAGAAGCCTT TGACAAGAAA 3900 

CAAATTCCAG CCTGGTTTGC TCAATGGGGA CATGATGTCG CCCATGACTG GGAATGGTGC 3960 

CGTAAACAAA TCCCTTATTT CCTCGGTAAT CTCTATTTAT AAAAGGAGTT ACCTATGAAT 4020 

TACCTTGTTA TTTCTCCCTA CTATCCACAA AACTTTCAAC AGTTTACCAT CGAACTAGCT 4080 

AATAAAGGCA TCACAGTCTT GGGAATTGGT CAACAGTCTT ACGAGCAATT GGATGAGCCC 4140 

TTGCGCAATA GCTTGACCGA GTATTTTCGT GTTCATAATC TTGAGAACAT AGATGAAGTC 4200 

AAACGTGCAG TTGCTTTTCT CTTTTATAAA CATGGTCCAA TTGGCCGCAT CGAGTCTC\C 4260

AATGAATACT GGCTTGAGCT AGACGCAACA CTCAGAGAAC AATTCAATGT TTTTGGTGCC 4320

AAACCAGAGG ATCTCAAAAA GACGAAATAT AAGTCTGAAA TGAAGAAACT TTTCAAAAAA 4380 

GCAGGTGTTC CTGTGGTACC TGGAGCTGTT ATCAAGACGG AAGCAGATCT TGATCAAGCA 4440 

GTGAAAGAAA TCGGTCTTCC AATGATTGCC AAACCTGATA ATGGAGTGGG AGCAGCCGCA 4500 

ACCTTTAAAC TTGAGACAGA AGACGATATC AATCACTTCA AGCAAGAATG GGACCATTCA 4560 

ACCCTTTATT TCTTTGAAAA ATTTGTCACT TCCAGCGAAA TCTCTACCTT TGACGGGCTC 4620

GTGGACAAGG ATGGAAAGAT TGTCTTCTCA ACAACCTTTG ACTACGCCTA TACACCGCTT 4680 
GACCTCATGA TTTATAAGAT GGACAATTCT TATTATGTGC TCAAGGATAT GGATCCTAAA 4740

CTGCGCAAGT ATGGGGAAGC AATTGTCAAA GAATTTGGTA TGAAAGAACG GTTTTTCCAT 4800

ATTGAGTTCT TCCGTGAGGG GGACGATTAT ATTACCATCG AGTACAATAA CCGCCCTGCA 4860

GGTGGTTTTA CCATTGATGT TTATAACTTT GCTCATTCCT TGGACCTTTA TCGTGGCTAT 4920

GCAGCTATTG TCGCAGGAGA GGAGTTCCCG GCGTCAGACT TTGAAACTCA GTATTGTTTG 4980

GCTACTTCTC GCCGTGCAAA TGCTCACTAT GTTTATTCAG AAGAGGATTT GCTTGCCAAA 5040 

TATAGCCAGC AGTTCAAGGT TAAAAAAGTC ATGCCAGCTG CCTTCGCGGA ACTTCAAGGA 5100 

GATTACCTGT ATATGCTGAC CACTCCGAGT CGACAAGAAA TGGAGCAGAT GATTGCAGAT 5160 

TTCGGACAAC GTCAAGAATA AGAACTATCG GATTAAGGAA ATTAACTCCC TTAATCCTTT 5220 

TGTTTTGTCT GATAAAAAAT AAGAGCATCC CAACAAGGTA GCTATCATAA AACTTGTTCG 5280 

ATAACTATTT GAAGCAGGAT TAGGTGGTCA GAAATTAAAT TTTAATATTT CAATTGAGTC 5340 

ATAGTATTGT GTTTGCGTAT CCTTAAATCA GCTAAAAGGA TCCATGACGA CACCTATACG 5400 

ATATAGTTTT CAAGATACCA AACAAGTCTA TTAATATTCA ATGAAAATCA AAGAGCAAAC 5460 

TACGAAGCTA GCCGCAGGTT TCTCAAAACA CTGTTTTGAG CTTCTCGATA GAACTGACAC 5520 

AGTCAGTATC ATATACTACG GCAAGGTGAA GCTGACGTGG TTTGAAGAGA TTTTCGAAGA 5580

GTATAAAATA TTCAGGTGAC GCATAGATAT AGTTAATTGA AGCTTTGTTT GAAATCTGAT 5640 

AAAATAATGA TATTACTAAG TTTTAAAAAC TAAAGAAAAG GGAAGATATG ATTACAGGCG 5700 

AATTAAAAAA TAAAATCGAT CAGCTGTGGG AAATTCTTTG CACAGAACCA AACGCAAATC 5760 

CTTTAACAAA TATTGAACAG TTGACTTATC TCTTATTTAT GAAAGATTTC GATAGTGTCG 5820 

AGCTTGGACG TGAAAGTGAT GCTGAATTTC TAGCGATTCC TTATGAGGGA GTTTTTCCAA 5880 

AAGATAAACC TGAATACCGT TGGTCAACTT TTAAAAATAT AGGAGATGCT CAGGAACTTT 5940 

ATCGTTTAAT GACTCAGGAG ATTTTTCCGT TTATTAAAAA TCTCAAGGGG GATACAGATG 6000 

ATACAGCCTT TTCACGATAT ATGCGAGAAG CTATTTTTCA AATAAATAAA CCTGCTACGC 6060 

TTCAAAAGGC AATTTCTATC TTAGATGTTT TTCCAACTAG GGGATTAGAT GTAGATTTTG 6120 

ATAATGACAA ACAAAGTATT ACTGATATCG GAGATATCTA TGAATATCTG TTATCAAAAT 6180 

TGTCGACCCC AGGTAAAAAT GGACAGTTCC GTACACCTCG TCACATCATC GATATGATGG 6240 

TTGAGTTGAT GCAACCGACT ATCAAAGATA TCATCTCAGA TCCCGCTATG GGTTCTGCTG 6300 

GCTTCTTAGT ATCTGCTAGC CGTTACTTAA AGCGTAAGAA AGATGAATGG GAAACCAATA 6360 

CAGATAATAT CAATCATTTT CATAATCAGA TGTTTCATGG AAATGATACG GATACGACTA 6420 

TGTTGAGACT TCGGGCGATG AACATGATGC TACATGGAGT AGAAAATCCA CAAATCAGTT 6480

ACCTTGACTC GCTCTCTCAA GATAATGAAG AAGCCGATAA ATATACTTTG GTTTTAGCAA 6540

ATCCTCCTTT TAAGGGCTCA CTTGACTACA ATTCAACCTC TAATGACCTT CTTGCAACCG 6600 

TAAAAACCAA AAAAACAGAA TTACTCTTTC TTTCTCTTTT CTTGCGAACT TTAAAACCAG 6660 

GTGGACGAGC AGCAGTTATC GTACCTGATG GTGTCCTTTT TGGTTCGTCT AAAGCTCATA 6720 

AAGGAATTCG TCAGGAAATT GTAGAGAATC ATAAGCTTGA TGCTGTAATC TCAATGCCTA 6780 

GTGGTGTGTT CAAGCCTTAT GCTGGAGTTT CAACTGCCAT TCTCATCTTT ACAAAAACTG 6840 

GTAATGGTGG TACTGACAAA GTCTGGTTTT ACGATATGAA AGCGGATGGT TTAAGTTTGG 6900 

ATGATAAGCG ACAACCGATT AGCGACAATC ATATTCCAGA TATTATCGAA CGCTTTCATC 6960 

ATCTTGAAAA AGAAGCAGAA CGTCAGAGAA CGGATCAATC TTTCTTTGTT CCAGTTGCTG 7020 

AGATAAAGGA AAATGATTAT GATTTGTCTA TCAATAAATA TAAAGAGATT GAGTATGAAA 7080 

AAGTTGAGTA TGAACCAACA GAAGTCATAT TAAAGAAAAT CAATGATTTA GAAAAAGAAA 7140 

TTCAAGCTGG CTTGGCTGAA TTGGAAAAAT TACTCAAGTA GGGAGGTGGC TGTATGAAAA 7200 

AAGTGAAGTT GGGGGAAGTC TTATCTCTAA AAAAAGGCAA GAAAGCCACT GTACTTGCTG 7260 

AACAAACAAC TCTAAGCCAA CGTTATATTC AAATAGATGA TTTAAGAAAT AATAATAATT 7320 

TAAAATTCAC TGAAAGTTTA AATATGACTG AAGCACTCCC AGATGATATT CTGATAGCAT 7380 

GGGATGGAGC TAATGCAGGA ACAGTTGCTT ATGGATTATC GGGAGCTGTT GGTAGTACAA 7440 

TTACGGTCTT AAAAAAGAAT GAGCGATACA AAGAAAAAA? TATATCAGAT TACTTGGGAG 7500 

TCTTTTTGGA AAGTAAATCG CAGTATTTAC GACATCATTC AACAGGTGCA ACAATTCCTC 7560

ATTTAAACAA GAATATATTA CTTGATTTAC AATTAGAATT GCTAGGTATC GAAGAACAAG 7620

AGAACATTAT CTGTATTCTT AATACGA7TA AAAGGCTTAT TACTAAAAGA AAATTTCAGT 7680

TAGATGAACT AAACTTGCTC GTCAAATCCC GATTTAACGA GATGTTTGCG GAAAATAAAA 7740 

TATTTGAAAG CATTGATAAC TTATTTGATA TTATAGATGG TGATAGGGGC AAAAATTATC 7800 

CTAAATCAGA TGAGTTGTTT AGTGAGGAGT ACTGTTTATT TTTAAATACA AAGAATGTTA 7860 

CTAAAAACGG ATTTTCATTC GATACAAAGC AATTTATCAC TAAAACAAAG GATAAATTAC 7920 

TTCGAAAAGG CAAACTTGAG CGTTATGATA TAGTCTTGAC AACAAGAGGT ACTGTTGGAA 7980 

ATGTACCGTA CTACGATGAA TTAATAAAAT ATAAACATTT ACGTATAAAT TCAGGTATGG 8040 

TAATATTACG TCCCAAGACA CCAAATCTAA ATCAGAAATT TATTATCCAT GTTTTAAGGA 8100 

ATAATAATTA TAGTCGAGTG ATATCAGGAA GTGCTCAGCC TCAGTTACCA ATTACAAAAT 8160 

TAAAAAAAAT ACTTCTCCCC CTCCCCCCAC TAGCCCTCCA AAATGAGTTC GCAGACTTTG 8220 
TAGTCCAGGT CGACAAATCA CAATTTGCTT GTGAGATAGC TATAAAAGTG TGGAGAAATA 8280

GCTTGAAATT TAGTATAATA TAGCTAAACT ATTTGTTTAA AGTGAGAAAA AAATGGGAAA 8340 

TTTTAGCTTT CTTTTAAAAA ATCACGAATA TGAATCTTTT TCAAAACCTT GCATTGAAGC 8400 

TGAGAATATG ATTGCTACAT CAACTGTGGC TACTGCCTTT ATGGCGCGTC GTGCTTTAGA 8460 

GCAGGCTGTC CATTGGATAT ATAGTCACGA TTCATATTTA GAAGCTCCCT ATCGTGCTAC 8520 

TCTATCTTCT TTAGTATGGG ATGATGATTT TAGGGATATC GTAGATTCTG AACTCCACAA 8580

GCAGATAGTT CTGTTGATTC GGTGGGGAAA CCATGCTGCT CATGGTGGTC AAATTAAGGA 8640 

ACGAGAAGCG ATTTTAGCTT TGCATCATTT GTATCAGTTT GTTAATTTTA TCGATTATTG 8700 

TTACAGCAAT GAGTTTGTGG AGCGTTATTT TGATGAGAAG TGCTTACCAC TTTCAGCAAA 8760 

CATCAAATAC CGAGAAACTC CACAATCTAT GATAAAGTTA CAAGACAGTT TACCAGAACT 8820 

GCCTGATTTT CATGAACAGA TGGCTGCTCA GTCCGTAGAA GTTCAAGAGA CTTATACTGA 8880 

AAAACGTGAG ACTGCAGCGC AACGGCAAGA TCTGCCTTTC CATATTGATC AATTATCTGA 8940 

GGCAGAGACA AGAAAGCTCT TTATTGATAT CGATCTCCGT TTAGCAGCAT CGATATTTGA 9000 

AGAAAACTGT CGTGTTGAGA TAGCCGTTGA TGGTCTCAAG CACGGTTCAG CAATTGCTTA 9060 

CTGTGACTAT GTACTTTATG GTAAAAATGG GAAAATTTTA GCGATTGTGG AGGCTAAAAA 9120 

AGCCTCTGTC AATCCAGAAG TAGGGGAAGT ACAGGTCAAA GAATATGCTG AAGCTTTGGA 9180 

GAAACATATC GGCTATCAGC CAATTTGCTT TATTACAAAT GGGTTGAAGC ACTATATACT 9240 

TGATGGTCCG AACCGCCGCC AGATTGCAGG CTTTTACTCT CAAGAAGAAT TGCAATTAGT 9300 

GATGGATAGA CCTCATCTTC AAAAACCGCT TGAGGATATT TCTAGTAAAA TTAGGGACGA 9360 

TATTTCCGGG CGTCACTACC AAAAACATGC CATTGCAAGC GTTTGTGAAG CTTTCTCTGA 9420 

TCATCGTAGA CAGGCACTTT TGGTTATGGC AACTGGGGCG GGGAAAACTC GTACAGCAGT 9480

TTCTCTAGTT GATATCTTAT CACGTCATAA CTGGGTAAAA AACCTTCTCT TCTTAGCCGA 9540

TAGAACTTCC TTGGTTAAGC AAGCATATGA TTCGTTTAGA AAATTACTCC CAGATCTTTC 9600

CCTTTGTAAC TTCTTAGAAG ATAAAGAAGG AGCTCAATCA AGTCGCATGG TCTTTTCAAC 9660 

TTATCCGACC ATGATTGGAG CGATTAGTGG TCAAGAAGAA GTAAATCAAC GCCCTTTCAC 9720 

TGTTGGGCAT TTTGACCTTA TCATAATTGA CGAATCTCAC CGTTCTATTT ATCAGAAATA 9780 

CAAGTCCATT TTTGATTATT TTGATGCAAG AATTGTAGGC TTAACAGCTA CTCCGCGTCA 9840 

AGATTTAGAT AAAAACACCT ATGGATTCTT TAATTTGGAG AATGGGGTTC CAACATATGC 9900 

ATATGATTTG GAAGAGGCTG TTAAAGACGG ATATTTAGTA GCCTATCATT CTATCGAAAC 9960 

CAAACTGAAA CTACCTACGC ATGGTCTACA TTATGATGAT TTGTCCGAAG AAGAAAAGGA 10020 
ACATTTTGAT AGCAAATTTG AAGACAATAG CTGTGAAAAA CATATTGATG GGAGTGTATT 10080

TAATTCCTTT ATTTTCAATA AAAGTACAGT AGAAATTGTT TTAAATGAAC TCATGACAAG 10140

AGGAATTCAG ACAGCCTCGG GTGATGAAAT TGGTAAAACT ATTATTTTTG CTAAAAATCA 10200

TGATCATGCG GAATATATCA GAGGTATTTT TAACAACCGC TATCCTGAAA AAGGGAGCGA 10260

CTATGCTCAG GTGATTGATT ATAGTATTAA CCATTATCAG ACCTTGATTG ATGATTTTAA 10320

AATTAAGGAG AAGTATCCTC AAATTGCGAT TTCTGTCGAT ATGTTAGATA CAGGTATTGA 10380

TGTACCAGAG GTTGTTAATT TAGTCTTCTT CAAGAAAGTA CGCTCTAAAA CTAAGTTTTG 10440

GCAGATGATT GGTCGAGGAA CCCGTCTATG TAAAGATTTA TTTGGACCTG AGCAGGATAA 10500

GGAAAACTTC TTGGTATTTG ATTATGGGGA CAATTTTGAT TATTTTCGTG CAGATCCAAG 10560

AGATGGAGAG GGTCGTCACA TTGTTTCGCT GACTCAGCGT TTATTTAATA TCAAAGTGGA 10620

CTTGATTCGA GAACTTCAGG GACTCCAATA CCAAGAAGAT CACTTTCCGA GAGCATACCG 10680

TCAGCAGCTT GTCTCGGAAC TTCAAGGTCG TATAGAGAGC TTAAATGAGT TGGACTTCAG 10740

GGTTCGTATG GTTTTAGATA CAGTTTATAG CTATAGGAAA TTGGAAAGTT GGCAGAATCT 10800

AACTGCTGTT ACAAGTGAAA CCATTCAAAA AAATCTCTCT CCGCTTTTAT TTGATGAAGA 10860

TAAAGAAGAT GAGATGGCGA GGAGATTTGA TTTGTGGTTG CTTCATATTC AGTTGGGGCA 10920

ACTGACAGCT AAATCTTCCA CTCTTCATAT TTCCCAAGTG ATGAAGACGG CTAGAGCTCT 10980

TTCTCCTATT GGCAATATCC CGCAGGTTTT TGAGCAGCCT GAAATTATCA GGAAAGTACA 11040

GCAGCCTGAA TTTTGGAAAG AAGTTAACTT GTCTGATTTC GAAAAAATTC CTCTTGCTAT 11100

TCGAGATTTA TTACAGTTTT TGGATAAAAC AGACCGTAAA CCCTACTATG TTAACTTTGA 11160

AGATCGTATA CTCTCCACTG TTCACGAGAC CACAGCATTT TTGCAGGTCA ACGATCTTCG 11220

GTCTTACAAT GAAAAAGTTG AGCATTATTT GAAAACTCAT CTGGATGAGG AGTCCATTTC 11280

TAAGCTATAC CATAATAAAA AGTTGACATC TGATGATATG CTTGCACTTG AAAAATTGCT 11340

TTGGGAAAAA TTAGGTAGTA AAGCAGACTA CCAAAGTCAT TATGAAAATA AGGCAATTCC 11400

GAGATTGGTT CGTGAGATTA TTGGCTTAGA TAGAGAGTCT GCCAATCGTA TTTTTTCTAA 11460

ATTTTTGTCG GATGAGAATC TTAATGCCAG GCAGATTTCA TTTGTAAAAT TGATTCTAGA 11520

CTACATTGTA GAAAATGGTT TTTTAGAGAC GAAAGTGTTA ACGCAAGAGC CCTTTAAATC 11580

TTATGGTTCT GTTCAACTAC TCTTCCAACA CCAACTACCA GTACTTCGTA ATATTGTTCA 11640

AATCATTGAA CTTATCAATA ATCGAGCTGG AGAAGCGGCT TAAATTCTAA AGTGATTGCC 11700

ATGCTGAGAC TCATTTAAAA TTAAAAAGAC TAGAAATTTA TGCTATATAT GAGAAGTTTT 11760




ATTAGGAAGA ATGTCATCGT TTTCCTAGAA TACAGTATCA GTTGTTAAGT GGTTGATAAA 11820 

TTTCAAAGTA GATACTTGTA CCACGATGTT TGTTGATCGA GTTATTAACA AAAGAGCTAC 11880

TTTGATTTTA AAGAAATAGA AAACAAAAAG CCGAGCAAGA ATTCAATTGC AGGAGAAAAT 11940 

GAAATAATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAA CTAGCTGCAG GCTGCTCAAA 12000 

ACACTGTTTT GAGGTTGCAG ATGGAAGCTG ACGCGGATTG AXGAGATTTT CGAAGAGTAT 12060 

AAATCTTCCT AGGATAAAGC AAAACGCATA GTATCAAGGG TTTTCAACAC TTGATACTAT 12120 

GCGTTTTCTG ATGTTAAAGA CTTTCTACCA GGTTTTTTAA AAGCATAATT GTTAGTTGTA 12180 

GTCATTTATT ATTCTTCAAA GAAAAATGGT GGGGCGAATT TTTTCAGTTC TTCAAAGCAC 12240 

TTTTGAGCAG TATCTGCATC TTCACAGATG ATAAGACAGA CATCATTACC ACAAAGGGTA 12300

CCGATAGCGT CAGGGAAGCT CAAAGTATCA ATGATAGAAC CAAAGGATTG AGCCAGTCCA 12360 

GGAAGGGTTT TTAGTAGGAC TTGGTGTTGA ACTGGGCGCA TCCAGACAAG GGCGTCTTCC 12420 

ATGTAGAGTT CGAGACGTTT TTCCCATTTT GAGATGGAAC CATTGTTAAG AACATAATAA 12480

GCGCTATCTT CTTCGCGGAC TTTTGATAGG TTCATATTTT TGATCTCGCG TGAGAGGGTT 12540

GCCTGGGTTA CTTGAATGTC GTTCTCAGCA AGAAGGGCTT GCAACTCAGC CTGTGTATGA 12600

ATCTTGTTTT TTGTGATAAG AGCGCGTATA AGTTGGTGGC GGTGTTCTGA TTTATTCATA 12660 

ATAATGTAAC TCCTTTTAGC AAGGTAAGCT AAGCATGGAC TGAGCGAGGT CGACAGTCAA 12720 

GTGGTAGTCT GTATTGTCAC GGATGGTGAT TTCAAAGTCA GTACTATAGA GGACTAAACG 12780

GAGAGTGTCT CCTTCTTTTA GCTTGTAAAT AGTTGGCTGC AGTTCAAATT GAACGTCCAT 12840

CCATTCATCT GCAGTAATAT CCTCTACTAA CAGTAAATCA TTTCTATTTT GTAAATTAAG 12900 

GTAACCTTTT GTCACGACTC GTTGTGCCTC TGGTCTAAAT GGCAATTCAC AGAGATTTTC 12960 

CAACATGTGA TAGCGACCGT TGTCAATGGT TCTAGCACTT AAAATAGCTG GATAAGGTTG 13020 

TAGGTATTTC TTTTGCCCAA ATTCTAGCAG TTGGGCAGAT AAGAGCCCCT TGTTTGTACT 13080 

GGATTTGATA CGAAGATTGA GCTGAGCGCG ACCGTTTAGG TGGAGATCTT TAGTCACAGG 13140 

AAGGTTAATA GTAATCTGAT TGGCTTTCCC TTGATAGAGC TCTGTATTGA AGGTTTGGTA 13200 

TGTCTTACCA TAGCGCTCAA AATCCTTATC TGGGTACTGG TTTTGAATAC CTTGCTCTTC 13260 

TTGACCAAGT GAGAAGGTTT CACAGTTTTC TTGCCCACCG AAGTTATCAA GTGATAACCA 13320 

AGTCTGTGGA GCTGTATTGT CCTGCCAGAT AACAGTAGGA AGTTGAAAGT CTGTTTCCTG 13380 

TCCTAGTAAT TTCTTGGTCA ATAAGGCATT TATGCACTCA CGGAAGTCAA TTGATTGCCA 13440 

ATTGTTCATG TAAACATGGG CACCATTATG GAAAAAGAGA TGCTTGTGTA TATGAGTAGG 13500 

AAGAGCATGG AACATCTGGT AAACATGAAG TGGTTTGACA TTCCAATCCT GAGAACCATG 13560 
AGTAAAGACA ACCTCTGCCT TTACTTTATG GGCATTGAGC AGAT/VxTTGC GGTCATGCCA 13620

AAACTGATTG TAGTCCCCAG TTTTTCGGTC TAGCTGAGCT TTCACTTTTT CTAAGTCAGC 13680 

TTGGTGAGCT TCATTGCCAC GGATATAGTC GCCAGCTAAG AGATTACGAG AATAGGTTAA 13740 

CTCAGCAAGG GAGTCAAAGT CCTCACCTGG ATAACCACCT GGGCTAGTCA CCAGACCGTT 13800 

TTCACGGTAG TAGTTGTACC ATGATGAAAT TCCTGCCTCG GCAATGATAA CTTCTAAACC 13860 

ATCGACTCCT GTAGTCGCAA GACCATTGGA CATGGTACCT AGATAGGAAA GTCCTGTTGT 13920 

AGCAACTTTT CCGTTTGACC AATCAGCCTT GACTTGACGC TGGCGCGTGT GATCAGTAAA 13980 

GGCACGGCAA CGACCGTTAA GCCAATCGAT GACATTTTTA TAAGCCTCGA TTTGCTGGTA 14040 

GTCTCCATTA GTCATGAAAC CTGTCGAGTC TTTGGTACCA ACACCTGAGA CATAGAGATT 14100 

GGCAAAGCCT CTCGGAAGGA AGTAGTCGTT TAGTGTATAG CTAGAGTTGA TGTGAGTTAG 14160 

CTTTTCCTCA GCCTCTGCTA TAAGCTCAGC TTTACCTTGG GGTTGGACGA GATTTAGTTG 14220 

AGGTTTCTCT AGCTCAATCT TGTGAGGAAG CTTAACCTCA AGCTCGCCCT CCATCTTGTA 14280

GAGAGCCTTG 7CACTAGCCT TGTCATTGGT TCCCTGATGA TAAGGGCTGG CTGTCATGAT 14340

GGCAGGGATT TTTCCATCAA AACGAGGGCG AATAATGCTA ACCTTTACTA GGTCTGATAG 14400 

CCCTTTTTGG TCAGTATCGA CACGAGACTC AACGTAAACG ACTTCACGAA TGACATCCTG 14460 

GTTAGAAAAA GTAGCCAAAC TCTTGCCGTT AAAGTAGTGG TAGTCATTAT CCTCCGGAAT 14520 

AAGACCATCA CTAACAAGTT CGTCGATAAG ACTATTTCCT TTTTTGGTGC GAGTATTGAG 14580

TAACTCATAG AGATTTTCAA TCAAGTCACC ATATATAATG GGAAATCCAG TTTCTTTACC 14640

AAAAACGTCA CTATCTTCGA AGTCAACCAA ATAAGAAAAG CCTAAAAGTT GAAAAGCAAC 14700 

AGTATAAAAA ATATCTCCTC TCAGTTCATC TTCTGATTGA AAAAATGTCA GCAGGTC7GT 14760 

TTTTTTATCA GCTGCTAGGA TAGAAAGTGG GTAGTTGGTG TCTTGATAAG TGAAAAAGAA 14820 

ACGACGTAAA AAGGTTTCAA GTGAGTCTTT GTGATTGGCT GTATTTTGTA AATCAAAGCC 14880 

ACATTTTTTT AGTTCAGATA AGACATTTTC TTTTGGAAAA TTGATATAAC TATATTCATT 14940

AAAACGCATA GAACCTCCAT ATAGAATGAC AGTTAAGGTT ATTATATCAA AAAAAAAGCA 15000 

GAAAGGGAAT TGTTAACTTC AAAAGGAAAT AATCCAATAA AAATGAATAA AGTACTAAAT 15060 

TCAATATAGA GAACAGACTA ACAATAAGAA TAAATAGATA GGGTATAAAA GTTCTAGGAG 15120 

ATTTATATTA TATGCTTTCT ATTTTTATAT ACAATATAGT ATAAATATAA AAATCATGAC 15180 

AAAAATACAA ATGAATAGAA AATAAATTAG TAAGCTGATG AAATTTTTCT CAAGAGAAGC 15240 

CATTTATAGG TGAAAATGGT ATAATATAGT GAGAAGGATA GAGGAGAAGT GTAAATTGAT 15300 
AGCGTCATTT CGATCGAAAA CGCACAACTA GATACAAAAA CAGTCTATAG TTTTATGGAA 15360

GCTATGATGG ATATTGACAA GTATGTGAGA GCAGCTAAAG AATACGGCTA CACTCATTTG 15420

TACGGCATTC ATCCTTTGCT TCTTTATGGC GCTTTCGACT TTCTAGAGAT TACAAAAAAA 15480

AATTTGCGCT TTTTACCTCT AGGGCTTGAA ATGACAGTGT TTGTAGATGA TCAGGGAGTG 15540

GCCAAGATGC ACGGGGAGAA ATCTAGTGTG GGCTATCAGC AGTTGATGAA GCTTTCGACA 15600

GTCATTGTGC CTTATTTTGA AACTTGGTCA GTCCTGTCCC AGTACCTGGA GGATATCGCG 15660

TAGAGTTGAG TCGTTAGAAC TAGGCTGTGA TTACTATATA GGGGTTTATC CAGAAACACT 15720

AGCAAGCGAA TTTCATCATC CTATCTTACC TCTTTATCGG GTCAACCCTT TTGAAAGCAG 15780

GGATAGAGAA GTTCTTCAAG TTTTAACAGC GATTAAAGAA AATCTACCGC TCAGAGAAGT 15840

TCCCTTGCGT TCGAGACAAG ATCTCTTTAT ATCAGCAAGT TCTTTAGACA AACTATTCCA 15900

AGAGCGTTTT CCGCAACCTT TGGACAATTT AGAAAAGCTT ATTTCAGGCA TTTCTTACGA 15960

CTTGGATACT AGTCTGAAAC TGCCTCGTTT TAATCCAGCT AGACCAGCAG TAGAGGAGTT 16020

GAGAGAGCGT GCTGAACTGG GGCTTGTTCA GAAGGGGTTG ACTACTAAAG AATATCAAGA 16080

TAGACTAGAC CAAGAATTGT CTGTTATTCA TGATATGGGC TTTGATGATT ATTTCTTGGT 16140

TGTTTGGGAT TTGTTGCCTT TTGGACAATC GAATGGCTAT TATATGGGAA TGGGAAGCGG 16200

TTCTGCAGTA GGCAGTTTGG TTTCTTATGC CTTAGACATC ACGGGGATTG ACCCAGTAGA 16260

GAAAAATCTG ATTTTTGAAC GCTTTCTTAA TCGTGAACGC TATACCATGC CTGATATTGA 16320

TATTGATATC CCAGATATTT ATCGTCCAGA TTTTATCAGA TATGTTGGTA ATAAATATGG 16380

TAGTAAACAT GCGGCACAAA TCGTTACTTT TTCAACCTTT GGAGCCAACC AAGCTCTTCG 16440

AGATGTCTTG AAACGCTTTG GTGTGCCAGA GTATGAATTA TCTGCAATTA CTAAGAAAAT 16500

CAGTTTTCCT GACAATC7TA AGTCGGCCTA TGAGGGAAAT CTCCAGTTTC GTCAGCAAAT 16560

CAATAGTAAG TTAGAATACC AAAAAGCTTT TGAGATTGCT TGCAAGATAG AGGGCTATCC 16620

AAGGCAAACC TCTGTCCATG CGGCTGGTGT TGTAATTAGT GACCAAGATT TAACCAACTA 16680

CATTCCTCTA AAGTATGGTG ATGAAATTCC ACTGACTCAG TATGATGCTC ATGGAGTTGA 16740

GGCTAGCCGA CTTTTGAAGA TGGACTTTCT GGGACTACGA AATTTGACCT TTGTCCAGAA 16800

GATGCAAGAG TTGCTTGCTG AAACAGAAGG TATTCATCTG AAAATTGAAG AAATCGATTT 16860

AGAAGACAAA GAAACGTTAG CTTTATTTGC CTCTGGTAAT ACAAAAGGTA TCTTTCAATT 16920

TGAGCAACCA GGTGCCATTC GTCTGCTTAA GCGTGTGCAA CCAGTCTGTT TTGAAGATCT 16980

CGTCGCGACT ACTTCTCTAA ATCGACCGGG TGCTAGTGAC TATATCAATA ATTTTGTGGC 17040

AAGAAAGCAT GGGCAGCAAG AAGTGACTCT TCTGGATCCA GTACTGGAGG ATATTTTGGC 17100
TCCAACCTAC GGCATAATGC TCTATCAGGA GCAGGTTATG CAGGTTGCCC AGCGACTTGC 17160

CCGATTTAGT CTTGGGAAAG CCGATATTTT GCGTCGGGCT ATGGGGAAAA AGGATGCCTC 17220

TGCCATGCAT GAGATGAGGG CTTCCTTTAT TCAAGGTTCA TTAGAAGCTG GTCATACTGT 17280

GGAAAAAGCA GAGCAGGTCT TTGATGTTAT GGAGAAGTTT GCAGGTTATG GTTTTAACAG 17340

GTCACACGCC TATGCCTACT CAGCCTTGGC CTTCCAGTTG GCTTATTTCA AAACGCATTA 17400

TCCAGCCATT TTTTATCAGG TCATGTTAAA TTCTTCCAAC AGTGATTACT TAATAGATGC 17460

ACTTGAAGCA GGTTTTGAAG TAGCCTCTCT ATCCATCAAC ACCATTCCCT ATCACGATAA 17520

AATTGCCAAC AAGGCCATCT ATCTAGGTTT GAAATCCATT AAAGGAGTCA GTAATGATTT 17580

AGCTCTCTGG ATTATTGAAA ATAGACCTTA TTCTAACATT GAAGATTTTA TAGCTAAATT 17640

ACCTGAGAAT TATCTGAAAC TTCCTCTGCT AGAACCTTTC GTAAAAGTTG GTCTTTTCGA 17700

TTCATTTGAA AAAAATCGTC AAAAAGTATT TAATAACTTA GCTAATCTAT TTGAATTTGT 17760

GAAAGAGTTG GGAAGTTTGT TTGGAGATGC TATTTATAGT TGGCAGGAAT CGGAAGATTG 17820

GACGGAACAA CAAAAATTTT ATATGGAACA AGAGCTTTTA GGGATAGGTG TCAGCAAACA 17880

TCCACTACAA GCTATTGCAA GTAAGGCTAT TTACCCGATT ACCCCAATCG GAAATTTGTC 17940

AGAAAATAGC TATGCTATTA TCTTGGTTGA AGTTCAGAAA ATAAAACTGA TTCGTACCAA 18000

AAAGGGTGAA AATATGGCCT TCTTACAGGC AGATGATAGT AAGAAAAAAT TGGATGTCAC 18060

TCTCTTTTCA GACTTATATC GTCAGGTTGG ACAGCAAATA AAAGACGGAG CCTTCTACTA 18120

TGTAAAAGGA AAAATACAAT CACGTGATGG CCGTCTGCAA ATGATTGCAC AAGAAATAAG 18180

AGAAGCAGTT GCTGAACGCT TTTGGATACA GCTGAAAAAT CATGAATCGG ATCAAGAAAT 18240

TTCACGCATT TTAGAACAAT TTAAAGGCCC AATCCCAGTC ATCATCCGGT ATGAAGAGGA 18300

ACAGAAAACC ATCGTTTCTC CCCATCATTT TGTAGCTAAA TCCAATGAAT TAGAGGAGAA 18360

ATTGAATGAA ATCGTTATGA AAACGATTTA TCGCTAAAAA TACGGAAAAT AGAAGAATTT 18420

TCAACGTAAA TGTGGTATAA TCAGTAAGAA TGTTAAAAGA AAAAGGAGCA TAACCAATAT 18480

GAAACGTATT GCTGTTTTGA CTAGTGGTGG AGACGCCCCT GGTATGAACG CTGCCATCCG 18540

TGCAGTTGTT CGTCAAGCAA TTTCAGAAGG AATGGAAGTT TTTGGTATCT ATGACGGATA 18600

TGCTGGTATG GTTGCCGGTG AAATTCATCC CCTAGATGCA GCTTCAGTAG GGGACATCAT 18660

TTCTCGTGGT GGTACTTTCC TTCACTCAGC TCGTTACCCA GAGTTCGCTC AACTTGAAGG 18720

GCAACTTAAA GGGATTGAGC AATTGAAAAA ACACGGAATT GAAGGTGTAG TTGTTATCGG 18780

TGGTGACGGA TCTTACCACG GCGCTATGCG TTTGACTGAA CATGGCTTCC CAGCTATTGG 18840




TCTTCCAGGT ACAATCGATA ACGATATCGT TGGTACTGAC TTCAC\ TCG GTTTTGACAC 18900
AGCGGTTACT ACTGCCATGG ACGCTATCGA TAAGATTCGT GATACATCAT CAAGTCACCG 18960
TCGTACTTTT GTAATCGAAG TTATGGGACG TAACGCTGGT GATATCGCTC TTTGGGCTGG 19020
TATTGCAACT GGTGCTGATG AAATCATCAT CCCTGAAGCA GGCTTCAAGA TGGAAGATAT 19080
CGTAGCAAGC ATCAAAGCTG GTTATGAATG TGGTAAAAAA CACAATATTA TCGTCTTAGC 19140
TGAAGGTGTG ATGTCAGCGG CTGAATTTGG TCAAAAACTT AAAGAAGCTG GAGATACAAG 19200
CGACCTTCGT GTAACAGAAC TTGGACATAT TCAACGTGGT GGTTCTCCAA CTGCGCGTGA 19260
CCGTGTTTTG GCGTCACGTA TGGGTGCACA TGCTGTTAAA CTTCTTAAAG AAGGTATCGG 19320
TGGTGTTGCG GTTGGTATTC GTAACGAAAA AATGGTTGAA AATCCAATTC TTGGTACTGC 19380
AGAAGAAGGG GCATTGTTTA GCCTTACTGC AGAAGGTAAG ATTGTGGTTA ACAACCCAGC 19440
TACAAA 19446

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16593 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



TCGTAAATAT GCTCTGTTTT TGGA'rrrTGT TTCTTAATCT GTTTGGCAAG TGCCTTCATC 60
ATAGAAATAG GACCACACAT ATAGACGGTT GCATGTTCGG GCACTTCTTT TTGTTCAAAA 120
TTAAGATAGC CGTCTTTCGT ACTGTCGATT AGATGGAGTT CAAAATTAGG ATTTTTCTGA 180
GCATAGTTAC GGAGTAAATC TAGGTAGACT GCATTTTCAT CTCCACGGAA GCTATAGTAG 240
AAGTGAACCT GTTTATCTAA AATAGGATGT TCACGGATGT AAGAGATGAA GGGGGTGATC 300
CCAATACCTC CAGCAATCCA AACCTGATTT TCTCGTCCTT CTTCTATGAT CATGTGTCCG 360
TAAGCTCTGT CTAGGGTTAC TTTGCTGCCG GCTTGAAGAT TATCATAGAT ATTCTTGGTA 420
TGGTCGCCTG AAGTTTTAAC AGTAAAGTAA AGAGTTTGAC CATGACCTCC TGAGATAGAA 480
AAGGGATGCG GAGCACTTTC AAAGCCTTCT TGGAAAATCT TTAGAAAGGC AAATTGTCCT 540
GATTGATAGT TGAAAGGTCT GCTAAGATGG ATTTGAATTT CTCTAGTATC GTGATTTAAG 600
CGTTTGAGAT GGGTAATTTT CCCTAGATAG GGGAAGGAAA TCTTTTGATA TAGAAAAATG 660
ATATAAAAAC CAGCTAGTAA GCCTAAAAGG GCATAGCTAC CAACAAGAAA ACTTAGAAGA 720
TTAAATGTAA GGAGACGATT GCCCATTATC ATGTAGATGT GAAAGAGTCC TAAAATATAG 780
GCTAGGTAAA CCAGGCGGTG AATCCATCGC CAAGCTTCGT ATTGGATGTA TTTGCCTAAA 840

TAGGCGACAA GGATGATGCT GGCAAAGATA TAGATGGCAA GATTGCCAAA CTGAGCAGCT 900 

AAGCGAGAGC CCCACAAACC GCCCATACTA AAGTTATGAA AGATTAGTAG GATGATTGAG 960 

AGAAAGGCTG TGAATTTGTG GACGGTGTAG ACCTTCTCCA AACTGTGAAA CCAGCTTTCT 1020 

AGTAGTGGGA GACGAGTGGC TAGGATAAAA GTCAGAGATA GGCTTGTTAA AGCTAGTCCT 1080 

GCAATCATGA ATTGGGGAGA AGTGTTCATC CAAGTCAAAA GAGTCAAGAT AAAACTAGCT 1140 

ATGATAAAGA GTAGTCCTTT GACTGATTTC ATAGAAAATT CCATTTCATT TAGATTTCGA 1200 

TTTGTTGTAA ATAAATTTGT TACATTTTAT CATAGAAAAT GTATGGTGTC AAATTGAGGT 1260 

CTATAAATAT CTACTCTCAT CAAAAAACTC TCCAATTGAA CTGGAGAGTG GCTGTTTATA 1320 

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGCA AGTTGCTCAA AACACTGTTT 1380

TGAGGTTGCA GATAGAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTCTGCAG 1440 

CTTGTTGCCA ACGTTTGCCT AGCATATGAG ACAGGCTAGA AATTGCTAGG TTAAAGCTGA 1500 

AGTAGATGAG GGCAATCAGG ATGTAAAGAC TGAAGACCTG CTCTGGTTCG AAATAACGGC 1560 

CCATGAGAAT TTGGCTGGCT CCAAAGAGTT CTTGTAGGGC GATAACAGAG TAGAGGAGAC 1620 

TGGTATCCTT AATCACGGTA ACAAACTGAG AAATGATGGC TGGTAGCATT TTGCGGATGG 1680 

CTTGTGGGAG AATGATGTAG TAGAGGATTT GCGCTGAGGT GAAGCCTTGT GACATTCCTG 1740 

CTTCGTACTG TCCCTTGTCT ACGGCATTGA GACCGCCTCG AATAATCTCA GCCAAGGCTG 1800

CTGATGTAAA GAGAGTAAAG GCTGTAATAC CTGCTGGTGT GGATTTCATT TTGAACACCA 1860 

AAAAGATAGT AAAAATCCAG AGAAGGTTGG GAACGTTGCG CACAAACTCG ATATAAATAC 1920 

TGGAAATAAT GCGTAAGACA GCATTTTTGC CATTTCTCGT GACAGCTAGC ACCGTACCGA 1980 

TGATAGTAGA GAGGATGATG GCAATCACAG AAATATAGAG GGTCAAGCCA AATCCTTTAA 2040 

AGATAAAGAC TAGGTTATCT GGGGTTAAAA CTTCTAAAAT AGATTCCATA GTAACCTCCT 2100 

AAAGTGAATA GGCTTTTTTG TTGGCTTCCT CCATCTTGCG ACCAAACTGG GCAACAGGGA 2160 

AGCATAGAGC AAAGTAGAGA AGAGCAGCAC CTAAAAAGGC TGGTATATAG TTTCCGTTGA 2220 

GAGCCGACCA AGACTTAGTC ACAAACATCA AGTCTACTCC AGAGATGATA GCTACACTAG 2280 

AGGTCTTCTT GATGAGGTTA ACAATTTGGT TGGTCAATGG AGGGAGAATG ATGCGGAAGG 2340 

CCTGAGGCAA GATAATCAAG CGCATGGCAC TGATATAGGT AAAACCTTGC GACAAGGCGG 2400 

CCTCCATCTG ACCACTAGGA ATAGACTGAA TCCCTGAACG AATAACCTCA GCGATATAAG 2460 

CGCCGTGATA GAGTCCCACG CAGAGAACGG CTGTCCAATA AATTGGAATC ATGATGATAT 2520 




GCTCACTGAT AAGAGGTAGG CCATAAAAAA CAATAACAAA CTGCACCAAG AGGGGAGTAT 2580 

TTTGGTAAAA TTCAACAAAG ATGCGAGCTA AAATGCGTAA AATTGGACGT TTACTGGTTG 2640 

ACATGGCACC AAAGAAGATG CCCAAAACCA TAGCGAGGAT AAAGGAACCA ACCGCTAGGG 2700 

CAAGGGTGAA GAGGAAACCA TTGAAAAATT GTCCAAAATC CTGAAAATAG GCTGTCCAAG 2760 

ATGATAAATC TGTCATGGGG TGTCCTCCTT AATCTGCAGT ATGGCTAGAT GGTTTGAGCT 2820 

TGTAACGGTC ATAAAGTTTC TGCAAACTAC CATCCTTGCT CCATTTAGTA ACCAAGTTAT 2880 

CAAGATAGTC GTTGAGCTCT GTATTTGATT TCTTGGTAAC AATACCGTAG TCAGATGGCT 2940 

TGAAACTATC ATCTAGTAGT GCTGTCCGTT TACTAGTGTA GCCAGATAGA ATAGAGCGGT 3000 

CAACGGAAAA GGTATCGATA CGATGAGCGT GCAGGGAAGT AATCAATTCT GGGTAGGAAC 3060 

CAAGTTCGAC GAATTTAAAC TTCAGACCTT TCTTTTTACC CAGTTCAGTA ATCAGGCGTT 3120 

GGGTGATAGA ACCTTGGGCG ACTCCGATGG TTTTGCCGTT TAGGTCCTCA ATCTTTTTGA 3180 

TTTTGGCAGA TTTATTGACC AAAAATCCAG AAGCGTCTGT GTAGTAGGGA CTGGTAAAGT 3240 


TGTAGAGTTT TTTGCGTTCG TCCGTGATGG TAAAGGTCCC GATATCCATA TCGACCTGTT 3300 

CATTGTCTAG AAGGGGGCCG CGGGTTTGTG CTGTAACCGG CACATACCGA ATCTTGACCT 3360 

TGAGTTCATC AGCTACCATC TTGGCCAAGT CGGTTTCGAT ACCAGAATAA GTACCGGTCT 3420 

TGGGATCTTT GTAACCAAAA TTGGGAACGT CTTGTTTGAC ACCGACAACC AGTTCGCCTC 3480 

TTTTTTGAAT GTCTGCGATA CTTGTATCAG CCTGGACTGG TTTGGCAGCA GCAAGGCCGA 3540 

AAAGGCTAAT CAATAATGCT GATAAAAAGA ATTTTTTTTC ATAGGCGCCT CCTTATTTGA 3600 

CTTTCTCACT TTCGTGGTTG ATAATTTTGC TGAGGAATTG TTGGGCACGA GGTTCGCTTG 3660 

GATTGTCAAA AAAGTTATCG ACATCTGTCG TATCTACTAA AACTTCTCCC TCGGCCATAA 3720 

AGATAATGCG GTCCGCAACC TCTCGAGCAA AGCCCATTTC GTGGGTAACC ATGATCATGT 3780 

TCATCCCATC ATGCGCCAGT TTCTGCATAA CTGCTAGAAC ATCTCCCATA GTCTCAGGAT 3840 

CAAGAGCAGA TGTTGGTTCA TCAAAGAGGA GGAGTTCCGG ATGCATAGCA AGACCACGAG 3900 

CGATGGCGAT CCGCTGTTTT TGTCCACCAG ATAGCATGCC GGGATAGCAA TCTTTCTTGT 3960 

CCCACATATT TACAAATTCC AGATATTTTT GGGCGGTTTT TTCAGCTTCT TTTTTATCAA 4020 

TTCCTAGAAC TTCAATGGGT GCAAGCGTTA CGTTTTCTAA CACAGCTTTG TGTGGATAAA 4080 

GGTTAAAATG TTGAAAAACC ATGCCGACTT CCTTGCGAAG AGGTACCAAA TCTTTCTGGC 4140 

TGGCACCAGC AACTTGGTGC CCATTGACTA GGAGACTTCC TTTCTCAACA GTCTCTAAAC 4200 

CATTGATCGT ACGGATAAGA GTGGACTTCC CAGAGCCAGA AGGTCCAAGC AGGACAACAA 4260 

CTTGTCCTTT TTCAAAACGG AGATTGATGT TGCGGAATGC GTGGTAGTCT CCGTAATATT 4320

TTTCGACGTT TTTAAATTCT ACTAAAGCCA TGAGAGATCT CTATTGTGTT ATATTTTATA 4380

ACACGGTTCT ACAATAAAAG AATGTTCTTG TCAAATCATA TCTGAAAAAA TTCACTATAG 4440 

TGAAATAAGA ACAGGAAAAA TCGATCGGCA CAGTCAAATC GATTTCTAAC AATATTTTAG 4500 

AAGTAGAGGT GTACTATTCT AGTTTCAATA TACTATAAAA TGTTATAAAA AAGCAATCTC 4560 

GATAGAGAAA ACGTCTAAAT CATGTTATAA TGAAGCAATA GAATTCTTAG AAAGAGTGGA 4620 

TGTCTTTTTC ATAACACCTA CTTATGAATG GCAGTTTGCC CTGCAGGTAG AAGATGCGGA 4680

TTTTACAAAG ATAGCCAAGA AGGCTGGACT GGGTCCTGAG GTGGCTCGGT TATTGTTTGA 4740 

GAGAGGGATT CAGAACCAAG AAAGTCTGAA GAAGTTTTTA GAACCTTCCT TGGAGGACTT 4800 

ACATGATGCT TATCTGCTCC ATGATATGGA CAAGGCAGTG GAGCGGATTC GTCAGGCTAT 4860 

TGAAGAAGGG GAAAATATTC TTGTTTATGG AGACTATGAT GCGGATGGCA TGACTTCGGC 4920 

TTCTATTGTG AAGGAAAGTT TGGAACAACT TGGTGCTGAG TGCCGAGTTT ACCTGCCAAA 4980 

TCGTTTTACC GATGGCTATG GCCCTAATGC TAGTGTTTAT AAATACTTTA TCGAGCAAGA 5040 

AGGGATTTCC TTGATTGTGA CGGTGGACAA TCGGGTTGCT GGTCATGAGG CTATTGCATT 5100 

CGCTCAGTCT ATGGGAGTAG ATGTCATTGT GACAGACCAT CATTCCATGC CTGAAACCCT 5160 

GCCAGATGCT TATGCTATTG TCCATCCTGA ACATCCACAT GCGCATTATC CTTTTAAATA 5220 

TTTGGCTGGT TGTGGAGTTC CTTTCAAGTT GGCTTGTGCC CTGTTAGAAG AAGTGCAAGT 5280 

GGAATTGCTT GATTTGGTCG CTATTGGAAC TATTGCAGAT ATGGTGAGTC TGACGGATGA 5340 

AAATCGTATC TTAGTTCAAT ATGGTCTGGA AATGTTGGGT CATACCCACC GCATTGGTCT 5400 

GCAAGAAATG CTGGACATGG CTGGGATTGC TGCCAACGAA CTAACAGAAG AAACGGTTGG 5460 

TTTCCAGATT GCTCCTCGTT TGAATGCCTT GGGTCGCTTG GATGATCCCA ATCCTGCCAT 5520 

TGATTTGTTG ACTGGATTTG ATGATGAGGA AGCGCATGAG ATTGCCCTTA TGATTCACCA 5580 

GAAAAACGAA GAGCGCAAGG AAATCGTTCA GTCTATCTAT GAAGAAGCCA AGACCATCGT 5640 

GGATCCTGAG AAGAAGGTTC AGGTCTTGGC CAAGGAAGCC TGGAATCCTG GGGTTCTAGG 5700 

AATCGTGGCT GCTCCTTTAT TGGAAGAATT GGGACAGACA GTCATTGTTC TTAATATAGA 5760 

AGACGGTCGT GCCAAGGGCA GTGCTCGTAC TGTGGAAGCG GTCGATATTT TTGAAGCTCT 5820 

GGATCCCCAT CGAGACCTCT TCATCGCCTT TGGAGGTCAT GCAGGTGCAG CGGGTATGAC 5880 

CCTGGAAGTT GAGCAACTCT CAGATTTATC TCAGGTTTTG GAAGATTATG TTCGTGAAAA 5940 

AGGTGCAGAT GCTGGTGGCA AGAATAAGTT AAACCTAGAT GAAGAGTTGG ATTTGGAGGC 6000 

ACTTAGCTTG GAAACGGTCA AAAGTTTTGA ACGTTTAGCT CCTTTTGGAA TGGATAATCA 6060 
GAAACCTATT TTTTATATCA AGAATTTTCA GGTCGAAAGT GCTCGTACTA TCGGGGCAGG 6120
TAATGCCCAT CTAAAGCTGA AAATTTCCAA GGGTGAGGCG AGTTTTGAAG TGGTAGCCTT 6180
TGGTCAAGCC AGATGGGCGA CAGAGTTTTC TCAAACCAAG AATCTAGACT TAGCGGTTAA 6240
ATTGTCTGTC AACCAATGGA ATGGCCAAAC TGCCCTCCAG TTGATGATGG TGGATGCGCG 6300
AGTGGAAGGT GTTCAACTTT TTAACATTCG TGGAAAAAAT GCAGTCTTGC CAGAAGGTGT 6360
TCCAGTCTTG GATTTTCCTG GAGAACTGCC AAATCTTGCG GCTAGTGAAG CTGTTGTCGT 6420
AAAAAACATT CCAGAGGATA TTACTCAGCT GAAGACCATT TTTCAGGAAC AGCATTTCTC 6480
TGCTGTCTAT TTCAAAAATG ATATTGACAA GGCTTATTAT CTGACAGGTT ATGGGACTAG 6540
AGATCAGTTT GCCAAATTGT ACAAGACTAT TTACCAGTTC CCAGAGTTTG ATATTCGCTA 6600
CAAGCPGAAA GATTTGGCTG CATATCTTAA TATTCAACAA ATCTTGCTGG TCAAGATGAT 6660
TPAAGTATTT GAAGAACTAG GCTTTGTGAC GATAAAAGAT GGTGTGATGA CAGTCAATAA 6720
AGAGGCGCCA AAGCGGGAGA TAGGAGAAAG TCAAATTTAC CAAAATCTCA AACAAACCGT 6780
TAAACACCAA GAAATGATGG CGCTGGGTAC GGTGCAAGAA ATTTATGATT TTTTGATGGA 6840
AAAAGAGTAG AAGTTAGGAA AGAGTTGGGA AATCAACTCT TTTTTGAAAA CAGACCTTCA 6900
*l*p r p r PfT A A A A T CATCAAAAAA ATGGTATAAT GGTAGGAAAA GATTCGGCTG AAAGTATCAG 6960
AAPTTTTAGA ATAAGAGGGT AGAATTGCCC TATAATCAAG ATAAACTAAG ATTTTGGAGG 7020
AAAAATGAGT AATATCAGTT TAACAACACT TGGTGGTGTG CGTGAGAATG GAAAAAATAT 7080
w <* A\>rV A * VJV* 4 GAAATTGGAG AGTCCATTTT TGTTTTGAAT GTAGGGTTAA AATATCCTGA 7140
AAATGAACAA TTAGGGGTCG ATGTGGTGAT TCCAAACATG GATTACCTTT TTGAAAATAG 7200
CGACCGTATT GCTGGGGTTT TCTTGACCCA CGGGCATGCG GATGCCATTG GTGCTCTACC 7260
GTATCTCTTG GCAGAGGCTA AAGTTCCTGT ATTTGGGTCT GAGTTGACCA TTGAGTTGGC 7320
AAAGCTCTTT GTCAAAGGAA ATGATGCCGT TAAGAAATTT AATGATTTCC ATGTCATTGA 7380
TGAGAATACG GAGATTGATT TTGGTGGGAC AGTGGTTTCC TTCTTCCCTA CGACTTACTC 7440
CGTTCCAGAG AGTCTGGGAA TTGTCTTGAA GACATCGGAA GGAAGCATCG TTTATACAGG 7500
TGACTTCAAA TTTGACCAAA CGGCTAGTGA ATCTTATGCA ACTGATTTTG CTCGTTTGGC 7560
AGAGATTGGT CGTGACGGCG TCCTGGCTCT CCTCAGTGAT TCGGCCAATG CAGACAGCAA 7620
TATTCAGGTG GCTAGTGAAA GTGAAGTTAG GGATGAAATT ACCCAAACTA TTGCTGACTG 7680
GGAAGGTCGT ATCATCGTTG CAGCTGTTTC CAGTAATCTT TCTCGTATTC AGCAGATTTT 7740
TGACGCTGCG GATAAAACAG GTCGACGTAT CGTCTTGACA GGATTTGATA TTGAAAATAT 7800
CGTCCGCACA GCGATTCGTC TTAAGAAGTT GTCTTTAGCC AACGAAATTC TTTTGATTAA 7860
GCCTAAAGAT ATGTCTCGCT TTGAAGACCA TGAGTTGATT ATTCTTGAGA CAGGTCGTAT
GGGTGAGCCT ATCAATGGAC TTCGTAAGAT GTCGATTGGT CGCCATCGTT ATGTAGAAAT 
CAAGGATGGG GACCTAGTCT ATATTGCTAC GGCTCCGTCT ATTGCTAAAG AAGCCTTTGT 
TGCGCGTGTG GAAAATATGA TTTATCAGGC AGGTGGGGTT GTCAAATTGA TTACCCAAAG 
TTTACATGTA TCAGGGCACG GAAATGTGCG TGATTTGCAG CTGATGATCA ATCTTTTGCA 
ACCTAAGTAC CTCTTCCCTG TCCAAGGGGA GTATCGTGAG TTGCATGCTC ACGCTAAGGC 
TGCCATGGCA GTTGGGATGT TGCCAGAACG CATCTTCATT CCTAAAAAGG GGACGACCAT 
GGCTTACGAG AATGGAGACT TTGTTCCAGC TGGATCGCTT TCAGCAGGAG ATATCTTGAT 
TGATGGGAAT GCCATTGGTG ATGTTGGAAA TGTTGTTCTT CGTGACCGTA AGGTCTTGTC 
AGAGGATGGA ATTTTCATCG TGGCTATTAC AGTCAACCGT CGTGAGAAGA AAATTGTGGC 
TAGGGCTCGT GTTCACACGC GTGGATTTGT TTATCTCAAG AAGAGTCGCG ATATTCTCCG 
TGAAAGTTCA GAATTGATTA ACCAAACGGT AGAAGAGTAT CTTCAAGGAG ATGACTTTGA 
CTGGGCAGAT CTCAAAGGTA AGGTTCGTGA CAATCTGACC AAGTACCTCT TTGATCAAAC 
CAAGCGTCCC CCAGCCATTT TACCAGTAGT CATGGAAGCA AAATAATCGT TGAAATAAAC 
AGAGAGAAAG TCGAGTTTCG GCTTTTTCTT ATAGAAAAAT AGAAGGAGAA AATCATGGCA 
GTGATGAAAA TCGAGTATTA CTCACAAGTA TTGGATATGG AGTGGGGGGT GAATCTCCTC 
TACCCTGATG CCAATCGAGT GGAAGAACCA GAGTGTGAAG ATATTCCCGT CTTCTACCTT 
TTGCACGGCA TGTCTGGAAA TCATAATAGT TGGCTTAACC GGACCAATGT AGAACGCTTG 
CTTCGAGGAA CTAATCTCAT CGTTGTTATG CCCAATACCA GCAATGGTTG GTACACCGAT 
ACCCAGTATG GTTTTCACTA CTACACGGC? CTAGCAGAGG AATTGCCACA GGTTCTGAAA 
CGCTTCTTCC CTAATATGAC GAGCAAGCGT GAAAAGACCT TTATCGCTGG TCTTTCTATC 
GGAGGCTACG GCTGCTTCAA ACTGGCTCTT ACGACAAATC GTTTTTCTCA TGCAGCTAGT 
TTTTCAGGTG CCCTCAGCTT TCAAAACTTT TCTCCTGAAA GTCAAAATCT GGGAAGTCCA 
GCCTACTGGA GAGGTGTTTT TGGAGAGATT AGAGACTGGA CAACTAGTCC CTATTCTCTT 
GAAAGTCTGG CTAAAAAATC GGATAAAAAG ACCAAACTTT GGGCGTGGTG TGGCGAACAG 
GATTTCTTGT ACGAAGCCAA TAATCTCGCA GTGAAAAATC TCAAAAAACT AGGTTTTGAT 
GTGACCTATA GCCATAGCGC TGGAACTCAC GAGTGGTACT ACTGGGAAAA ACAATTGGAA 
GTTTTTTTAA CAACCCTACC AATTGATTTC AAATTAGAAG AGAGACTGAC TTAGTTTGAA 
CTTCAGCATA GGGCGAGTAG AACTAAAATA AAATATCTTT TCACTAGACT TTTCAAACGm
AAGTAGTAGA ATAGTAATAA AATACTGGAG GAAAGAGAGT AGGAAATGTA CCGTTATCAA 9660

ATTGGCATTC CCACATTAGA ATATGATCAG TTTGTCAAAG AACATGAATT AGCCAATGTA 9720

TTACAAAGTA GTGCTTGGGA GGAAGTTAAG TCTAATTGGC AACATGAGAA GTTTGGTGTT 9780 

TACAGGGAAG AAAAATTACT GGCGACAGCT AGTATTTTGA TTAGAACTCT TCCGCTAGGC 9840 

TATAAAATGT TTTACATCCC AAGAGGACCT ATATTGGATT ATGGGGATAA AGAACTCTTG 9900 

AATTTTGCCA TTCAGTCTAT TAAGTCCTAT GCTCGCAGTA AGAGAGCGGT TTTTGTGACT 9960 

TTTGACCCAA GTATTTGCCT ATCTCAAAGT TTAATCAATC AGGAAAAGAC AGAATTTCCT 10020 

GAAAATCTGG CTATTATTGA TAGTTTGCAA CAAATGGGAG TAAGGTGGTC AGGAAAAACG 10080 

GAGGAAATGG GAGACACCAT TCAACCTCGT ATTCAGGCGA AAATATACAA GGAAAATTTT 10140 

GAAGAAGATA AACTTTCCAA GTCAACAAAA CAGGCTATTC GAACAGCACG AAACAAAGGG 10200 

CTTGAGATTC AATATGGTGG ACTGGAACTA TTAGATTCAT TTTCGGAGTT GATGAAAAAA 10260 

ACTGAGAAGC GAAAAGAGAT TCATTTGAGG AATGAAGCCT ATTATAAAAA ATTGTTAGAT 10320 

AATTTTAAGG ACAAGGCCTA TATCACCTTG GCCACCTTGG ATGTTTCTAA ACGTTCGCAA 10380 

GAGTTAGAAG AACAGTTAGC GAAAAATAGA GCCTTGGAAG AGACCTTTAC TGAGTCGACT 10440 

CGAACTTCAA AAGTAGAAGC GCAGAAGAAG GAAAAAGAAC GTTTGTTAGA GGAATTGACC 10500 

TTCTTGCAGG AATATATAGA TGTAGGTCAA GCGAGAGTTC CTTTAGCGGC TACTTTGAGT 10560 

TTGGAATTTG GTACTACCTC TGTCAATATA TATGCTGGTA TGGATGATGA TTTTAAACGT 10620 

TACAATGCAC CAATTTTAAC TTGGTATGAA ACGGCTCGCT ATGCCTTTGA ACGAGGTATG 10680 

ATCTGGCAAA ATTTAGGTGG TGTTGAAAAC TCTCTCAATG GTGGACTTTA TCATTTTAAG 10740 

GAAAAATTTA ATCCAACGAT TGAAGAATAC TTGGGTGAAT TTACAATGCC CACTCATCCT 10800 

CTCTATCCTC TGTTAAGACT TGCTCTTGAT TTCCGTAAAA CATTAAGAAA AAAACATAGA 10860 

AAGTAAGTAT ATGGCACTAA CAACACTCAC GAAAGAAGAG TTTCAGACTT ATTCTGATCA 10920 

GGTTTCTTCT CCTTCCTTTA TGCAATCTGT CCAGATGGGG GATTTGCTAG AAAAAAGAGG 10980 

GGCTCGAATT GTTTATCTTG CTTTGAAACA AGAAGGAGAA ATTCAAGTTG CAGCTCTGGT 11040 

TTATAGCCTG CCCATGCTGG GTGGTCTGCA TATGGAACTC AATTCGGGGC CGATTTATAC 11100 

CCAACAAGAT GCTCTTCCAG TTTTTTATGC AGAGTTAAAA GAATATGCCA AGCAAAATGG 11160 

TGTATTAGAG TTGCTTGTAA AACCCTATGA AACTTATCAA ACTTTTGATA GCCAAGGTAA 11220 

TCCAATAGAT GCTGAGAAAA AAAGTATTAT TCAAGATTTG ACTGATTTAG GTTATCAATT 11280 

TGATGGCTTA ACAACAGGTT ACCCAGGTGG AGAACCAGAT TGGTTATACT ATAAAGATTT 11340 

AACTGAATTA ACTGAAAAGA GTTTGCTTAA AAGTTTTAGC AAAAAGGGTA AACCCTTGGT 11400 
GAAAAAGGCT GAAACCTTTG CCATTCGGTT GAAAAAGTTA AAACGTGAAG AACTATCGAT 11460
TTITAACAAT ATAACAAAAG AAACCTCTGA ACGTAGAGAA TATAGTGATA AAAGTTTAGA 11520
ATATTATGAG CATTTTTATG ATACTTTTGG AGAACAAGCG GAGTTTCTCA TAGCAAGCTT 11580
AAATTTTTCG GACTATATGA GCAAATTGCA AGGTGAACAA AGTAAACTAG AAGAAAACTT 11640
GGACAAGTTG CGACTTGATT TGAGTAAAAA TCCTCATTCT GAGAAAAAAC AAAATCAACT 11700
GAGAGAATAT TCTAGTCAAT TTGAAACGTT TGAAGTTCGA AAAGCAGAAG CGCGAGACTT 11760
GATTGAAAAA TATGGAGAAG AAGATATTGT TTTAGCTGGG AGTTTATTTG TTTATATGCC 11820
TCAGGAAACG ACTTATCTCT TTAGTGGTTC CTACACTGAG TTTAATAAGT TCTATGCCCC 11880
TGCACTGCTT CAAAAATATG TTATGTTGGA AAGCATAAAA CGTGGAATAC CTAAATACAA 11940
CTTCCTAGGC ATTCAAGGGA TTTTTGATGG AAGTGATGGT GTTTTGCGTT TTAAACAGAA 12000
TTTTAATGGC TATATTGTAC GCAAAGCAGG TACTTTCCCT TACCATCCAT CGCCTTTAAA 12060
ATACAAAGCT ATCCAGTTAC TCAAAAAAAT AGTAGGACGT TAAGATGAAA AAGTCAGTAT 12120
TTAGATTTCT TTTAGCTTCT TTTAGTAAAA TAATTCTTAT TTGCTAGAAA GGTGGAGAGA 12180
CATGCGCTGG CTTTTTCGTT TGATAGGGGC TTTCTTTTCT TTTGTGTGGC GTrrGrrrrc 12240
GCGTCTGGTT TGGATAGTTG TGCTCTTATG TGTGCTTGCT TTCGGACTTC TCTGGTATCT 12300
GAACGGAGAT TTTCAAGGAG CGCTAAAGCA AGCAGAACGG TCAGTAAAAA TTGGTCAACA 12360
AAGTATTGAC CAATGGGAGA AAACAGGGCA ACTGCCTAAG TTAAGCCAGA CAGATAGTCA 12420
CCAGCATTCT GAAGGAAGGT GGGCACAGGC CTCTGCTCCT ATTTACCTGC ATCCGCAGAT 12480
GGATTCACCC TTTCAAGAGG CTTATTTAGA AGCAATCCAG AACTGGAATC AAACTGGTCC 12540
TTTTAACTTT GAACTCGTGA CTGAGTCTAG TAAGGCGGAT ATTACGGCTA CGGAGATGAA 12600
CGACGGAGGC ACTCCTGTGG CAGGACAGGC GGAAAGTCAA ACTAATCTCT TAACAGGCCA 12660
ATTCTTGTCC GTAACGGTGC GGTTGAATCA TTATTATTTG TCCAATCCAT ACTATCGCTA 12720
CTCCTATGAA CGCCTTGTCC ATACGGCAGA ACATGAGTTA GCTCATGCCA TTGGCTTGGA 12780
CCATACAGAT GAGAAGTCTG TCATGCAACC AGCAGGTTCC TTTTATGGTA TCCAGGAAGA 12840
GGATGTTGCA AACCTCCGAA AAATATATGA GACTAGTGAG TACGGTACTA TCTTTCCCTA 12900
CTTTTTTTGC TATAATGGAA CTATGAACAA CTTGATTAAA TCAAAACTAG AGCTCTTGCC 12960
GACCAGCCCT GGTTGCTACA TTCATAAGGA TAAAAATGGC ACCATTATCT ATGTAGGAAA 13020
GGCTAAAAAT CTGCGTAATC GAGTACGGTC CTATTTTCCT GGAAGTCATG ATACCAAGAC 13080
AGAGGCTCTG GTGTCTGAAA TTGTGGATTT TGAATTTATT GTTACGGAGT CTAATATTGA 13140
GGCACTTCTC CTAGAAATCA ACCTGATCAA GGAAAACAAG CCCAACTACA ATATCATGCT
CAAGGATGAC AAGTCCTATC CTTTCATCAA AATCACCAAT GAGCGCTATC CACGCTTGAT 
TATCACTCGT CAGGTCAAAA AGGACGGAGG TCTTTATTTT GGACCCTATC CCGATGTGGG 
GGCAGCCAAT GAAATCAAGC GGTTGCTGGA TCGGATATTC CCTTTTCGTA AGTGTACCAA 
CCCGCCCTCT AAGGTCTGTT TTTATTACCA TATCGGCCAG TGTATGGCCC ACACCATCTG 
TAAGAAGGAT GAGGCTTATT TCAAGTCTAT GGCCCAGGAG GTGTCTGATT TTCTGAAAGG 
TCAGGATGAC AAAATCATCG ATGATCTCAA GAGTAAAATG GCAGTAGCAG CACAAAGTAT 
GGAGTTTGAA CGTGCGGCGG AATACCGTGA CCTGATTCAG GCTATTGGAA CGCTTCGAAC 
CAAGCAACGG GTCATGGCGA AAGATTTGCA AAATCGCGAT GTCTTTGGCT ACTATGTGGA 
TAAGGGCTGG ATGTGTGTGC AGGTTTTCTT TGTCCGTCAG G £ AAGCTCAT CGAGCGCGAT 
GTCAATCTCT TCCCCTACTT CAATGATCCA GATGAGGATT TTTTGACCTA TGTAGGACAA 
TTCTATCAAG AAAAATCTCA TCTAGTTCCC AATGAGGTAC TGATTCCGCA CATATTGACG 
AAGAAGCTGT CAAGGCTTTG GTGGATTCCA AGATTCTTAA GCCTCAACGT GGAGAGAAAA 
AACAACTGGT CAATCTAGCC ATAAAAAATG CTCGTGTTAG TCTAGAGCAG AAGTTCAATC 
TGCTAGAAAA ATCTGTCGAA AAGACTCAAG GAGCTATTGA AAATCTAGGG CGTTTGCTCC 
AAATCCCGAC CCCAGTACGT ATCGAGTCCT TCGATAACTC TAATATCATG GGAACTAGCC 
CTGTTTCGGC TATGGTGGTC TTTGTCAACG GTAAACCGAG TAAGAAGGAT TACCGTAAGT 
ACAAGATAAA AACGGTTGTT GGACCAGACG ACTATGCCAG CATGAGAGAG GTCATTCGCA 
GACCCTATGG TCGAGTACAG CGTGAGGCTT TGACTCCTCC AGATTTGATT GTGATTGATG 
GGGGGCAAGG TCAAGTCAAT ATCGCTAAGC AGGTTATCCA AGAGGAACTG CGCTTGGATA 



TTCCAATTGC TGGGCTGCAA AAGAATGATA AGCACCAAAC CCATGAATTG CTCTTTGGAG
ATCCGCTTGA GGTGGTGGAT TTGTCTCGCA ATTCTCAGGA ATTTTTCCTC CTCCAACGCA
TCCAAGATGA GGTGCACCGC TTTGCTATCA CTTTCCACCG CCAACTGCGC TCCAAAAATT
CTTTCTCATC TCAATTGGAT GGGATTGACG GTCTGGGACC TAAACGCAAG CAGAATCTTA
TGAAGCATTT CAAGTCTTTG ACCAAAATCA AGGAAGCCAG TGTGGATGAG ATTGTCGAAG
TTGGGGTACC TAGAGTCGTT GCAGAGGCTG TGCAAAGAAA GTTGAACCCG CAGGGAGAAG
CCTTGCCTCA AGTAGCAGAA GAAAGAGTAG ATTACCAAAC GGAAGGAAAC CACAATGAAC
CATAAAATCG CAATTTTATC AGATGTTCAT GGCAATGCGA CGGCGCTAGA AGCAGTGATT
GCAGATGCTA AAAATCAAGG GGCCAGTGAA TATTGGCTTC TGGGAGATAT TTTTCTTCCT
GGTCCAGGCG CAAATCACTT AGTCGCCCTG CTAAAGGACC TTCCTATCAC AGCAAGTGTT
CGAGGCAATT GGGATGATCC TGTCCTTGAG GCTTTAGATG GGCAATATGG CTTAGAAGAC 15000

CCACAGGAAG TTCAGCTCTT GCGTATGACA CAGTATTTGA TGGAGCGAAT GGATCCTGCA 15060 

ACGATTGTCT GGCTACGAAG CTTGCCTTTG CTGGAAAAGA AAGAAATTGA CCGATTGCGC 15120 

TTTTCTATCT CTCATAATTT ACCTGACAAA AACTATGGTG GTGACTTGCT AGTTGAGAAT 15180 

GATACAGAGA AATTTGACCA ACTGCTAGAT GCGGAAACGG ACGTGGCAGT TTATGGTCAT 15240 

GTTCACAAGC AGTTGCTTCG TTATGGAAGT CAAGGGCAAC AAATCATCAA TCCAGGGTCG 15300 

ATTGGCATGC CCTATTTTAA TTGGGAGGCG TTAAAAAATC ACCGTTCCCA GTATGCCGTG 15360 

ATAGAAGTTG AAGATGGGGA ATTACTCAAT ATCCAATTTC GTAAAGTTGC TTATGATTAC 15420 

GAAGCTGAGT TAGAATTGGC CAAGTCCAAG GGGCTTCCCT TTATCGAAAT GTATGAAGAA 15480

CTGCGTCGTG ACGATAACTA TCAGGGGCAC AATCTGGAAT TATTAGCCAG CTTAATAGAA 15540 

AAGCATGGGT ATGTAGAGGA TGTGAAGAAT TTTTTTGATT TTTTGTAACA GTTTCCTAAA 15600 

ATAGCCAATG CAAACTAAAA AAGCGATTTG CTGGTCCAAT CGCTTTTAGT ATATCTTATA 15660

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGTA GGTTGCTCAA AGCACAGCTT 15720 

TGAGGTTGCA CATAAAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTGTAACT 15780 

GAGATTGATC TGGGAGGTAA GAACCACCTA GATAGCTATT GCTGAGTTTT TCAAGGGTTC 15840 

CGTCTTGATA GAGTTCTTTG AGCGCTTTAT CAAATTGCTC TTTAAACTCT TTTTGGTCGC 15900 

TTGAGAAAAT GATATAATTG CTGGGGCTAT CTGCAGAAGG TAAATCAACG ACTGAGAGGT 15960

CTAAACCACG GTCCTTGATA ATCTTTTGAA CCCATACCTT GTCAAAAACT AGGAAATCAA 16020

ACTCTCCGTT AGCAAGGTCT AGGATTCCTT TACCAATATC CTCACCAGAA AAATTAATTG 16080

TAGCGGGATT ATCAGTGTGT TTCTGATTCC AGTTATTGAT GAATTGAGCG TTAGAAGTTC 16140 

CGGTATCCTC TTGTGTTGTT TTACCAGCGA TCTGCTCAAG AGAAGTCAAA GGATTTTTCT 16200 

TGTTGCTGAC AAGGACGAGG GGATTGTTGG AAATTGGAAG CGAGTAAAGG TATTTTTCAG 16260 

CACGCTCTTT TGTGTAACTC AAGTTATTGG CCGCAGCCTG ATAGTGACCA GAATCAAGTC 16320

CTGGGAAGAT GCTCTCCCAG GCGGTTCTTT GGAATTGAAT CTCGTAGTCG CTGAGTTTTT 16380 

CATCTACTGC CTTTAAAACT TCGATATCAA AGCCTGTCAG ATTGCCCTTG TCTTCGTAGT 16440 

CAAATGGTGG CACGTCGCCA GCTGTAGCAA GGACGATTGT CTTTTGAGCG CTAGTCTCTT 16500

TGGGTGTAGC TTGATTCTCA CAGGCAACCA AAAATGGTAG GATAGCTAGT AATAGGCTAA 16560 

ATTTTTTCAT ACTGTCTCCA TTCAAATGTA AAG 16593 
(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3510 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



GGGATATCCT TATATCCTTG TTCCTGGAAC CATTGTGGGA ATTGCTCAAC AGTTTTTTCA 60
C PTTfi AATT C CTGGTGCAAT GACAGTAAGA ATTTCGAAAT CACGATCTGG TTTCGCCGCT 120
AGTTCCATCA ACTCTGGCAT ACTTTTCTTG CATGGACCAC ACCATGAAGC CCAAAACTTC 180
AAlTTAAACCT TTTTACCCTT AAAATCAGAT AACTTAACTT CTTTGCCATC CATGGATTGC 240
AATfJTGAAtlT CTGGAGCATC TTTTCCAACA GCAATTTGTT GTACAGTCGT TTGTTGTTTT 300
GGCTCTTGTG CTGCTTGAGT CTTTTTAGTT TCTTCCTCAC CACAGGCCAT CAATACAACT 360
AATGAPAAGA GACTTAAGCC AGCAAACATT AcriTTTTCA TTTGTCCTCC TTTATTCAAA 420
AATTPCAGPT AGAACATT^A CTTGTCCTAA TAGTAACAAA ATTCCCATTA AAACAATGAG 480
GAAACCACCA ATTTTCTTTA GTAGCATCAT ATGACGCTTG ATTTTACTAA AATATGGCAT 540
GACTAGACCT GAAGCTAGTG CCAATACCAA GAAAGGAAGG GCCATGCCaG AGTGTAAATG 600
AfiAGTATAAA TCGCTCCTTG CCAAGCGCCA TTCCCTCCAG AAGCCGCAAG TGCTAAAACA 660
GAACTTAAAA CTGGACCAAT ACAAGGTGTC CAACCAAAGC TAAAGGTAAT ACCAAGTAAA 720
AAAGCTGACC AATAACGATT AGAATCTGAT TTTTTAAAGG TAAAACTTTT TTGAACTTCT 780
AATTTCTTCA AATGAAAAAT TTCCATCTGG TGAAGACCCA AAATGATAAT AATAGCTCCC 840
ATGCCATATC GAAACCAATT TGCATAGAGA ATATGACCAA AGTAACCAGC ACCAAAGCCT 900
AGAATAAAGA AAATGAGAGA GATACCAGCG ATAAAGCAAA GTGTTCGAAT CAAGCCTGAC 960
CACAGAACCT TTCTCCCAAA CAAAGAAAAG CTTTTTGCAC TTTCTTGATC ATCCAATAAA 1020
ATCCCAGCAT AGACTGGCAG AAGAGGAAAA ATACAAGGAG AAAAAAAGGA TAAAACACCT 1080
GCTAGAAAAA CAGAGATTAA AAATACTATC GTTTCCAATA AAGAACCAAC TTTCTTAATA 1140
ATTCTAATCC TATTTTACTA TATTCAATTT TATTTGTAAG CTTTCTGCTA CGCAAAATCG 1200
TATCGGGCAC TATTGGACCA ATCTTTTCTT TTGCTAGTCA AGGCGGATCT TATCCCCCAA 1260
AATAGCCAAA AAGCAACGAC AAGGATTACT CATCGCTGCT TTTGTGAACG AAAATGTCTT 1320
TTAGGTCTGA CATTTCATAA ATCATGTTTT ACTTGAGTTT GTCAAGGATT GCTTTAAGCT 1380
CCTCTACTAG TTTAGTTTCT GTCTCTGCTG AGCCATTTTC TTCTTTCACG AAATCAAGGG 1440
TTTCTTGGAG AAGGTTTTGG GCTTTGGCAA GGACTTTTTT ATCCGCTTTT TCTGCATCTA 1500
GCTGTCCTAG AACCTTGATC AATTCCGTGC TTAATTGCTG gatt: :gac TCTTTCTTAC 1560
GGCGAATCAG CCAGAAGGCA ATCACGCCTA GGAGGGCAAG TAGACTGACC ACAATCACTC 1620
CTGCCGGAAC TGAGTTTGTT TCAGTCATCT TATCTGAATC CTTACTATCT TCCGTTCCTT 1680
GTTTTGCATC CTTCTTGTCC TGTGCAGGCT TGCTGTCGCT AGCATTTGCT TTCACATCTT 1740
TGAGAGAGTC CAAGGCAGCC CAGCCTTCAC AGACTCTACT GCAGTATGCA GACCTTACTC 1800
TGTCAAGGCA CTATCTTCCG GAGCTTTTTG AGCATCTAGG AGGACAGCCT TGGTTGCATC 1860
GATTTTCGGA TCAGATACTG TTGCCAAAGC TTTCAAGCGT TGGTCTAACT CTTGACTCAA 1920
GGCACGAAGT TCAGACTTGT CAACTTGCTC TTGAGCTTGT GTGCTCGTTG AGCTAGCCGA 1980
AGCGCTTGCT ACCACTCTAG GATCTTGAGT CGGAGCTGAG CTTGGAGCTG GGACAGGGCT 2040
TGCAGGTTGA CTAGGAACAG TTATGGTATA TTGAAACTAG AATAGTACAT ATGGACTTCT 2100
AAAACATTGT TAGAATTCGA TTTTACTGTC CTGATCGATT TGTCCTATTC TTATTTCATT 2160
TTACTATAAT AACCGATGCT GTGGTTAATG TTGGTAAGAG AAACTTCTGA AACCAAGCTT 2220
CAAAAAAGTC GCTCGTCATC GTCTCTTCGT AAGTCATTGG AGCGATTAAT TCACCATTTG 2280
TTAGACCTGC AACCAAAGAA ATCCTCTGAT ATCTTCTTCC AGATACTTTG CCTCTTATTA 2340
ACTGACCTTT TAATGAGCGA CCATATTCTC GATAAAAATA AGTATCGAAT CCTGTTTCGT 2400
CAATCTAAAC AGGTGCTAGG TGCTTTAAAC TATTAAAATT CTTAAGAAAT AAGGCTACTT 2460
TTTCTGGGTC TTGTTCATAG TAGGTGTGGT TCTTTTTTTC GAGTGTAGCC CATAGCTTTG 2520
AGCGCATAGT GGATGGTAGT TGGATGACAG CCAAAlcTCAG AAGCTATTTC AGTCAAATAA 2580
GCrTCTGGAT TGTCAGTAAG ATAGTTTTTA AGTCTATCTC TATCAACTTT TCTTGGTTTT 2640
GTTCCTTTTA CTTGGTGGTT TAGCTCTCCT GTTTTCTCTT TTACCTTTAA CCAGCCATAA 2700
ATGGTATTAC GTGAGATTTG GAAAACGTGT GATGCTTCTC TTATACTACC TATTCGCTCA 2760
CAATAAGAGA GAACTTTTTT ACGAAAATCT ATTGAATATG CCATAAGAAG ATTATACCAC 2820
ATTGTGTACT ATTTTTGGTT CATTTCACTA TAACACAAAA TAGATTATTA TTACATAACA 2880
AAAAAGAGGT CTAAACCTCT TAACTCAATT ACTCCGCCAG TAGGACTCGA ACCTACGACA 2940
TCATGATTAA CAGTCATGCG CTACTACCAA CTGAGCTATG GCGGATTAAA GCTAAGCGAC 3000
TTCCCTATCT CACAGGGGGC AACCCCCAAC TACTTCCGGC GTTCTAGGGC TTAACTTCTG 3060
TGTTCGGCAT GGGTACAGGT GTATCTCCTA GGCTATCGTC ACTTAACTCT GAGTAATACC 3120
TACTCAAAAT TGAATATCTA TTCAATTTAA GAAAACCGVr CGCTTTCATA TTCTCAGTTA 3180
CTTTGGATAA GTCCTCGAGC TATTAGTATT AGTCCGCTAC ATGTGTCGCC ACACTTCCAC 3240
TTCTAACCTA TCTACCTGAT CATCTCTCAG GGCTCTTACT GATATATAAT CATGGGAAAT 3300
CTCATCTTGA CCTGGkTtCA CACTTAGATG CTTTCAGCGT TTATCCCTTC CCTACATAGC 3360
TACCCAGCGA TGCCTTTGGC AAGACAACTG GTACACCAGC GGTAAGTCCA CTCTGGTCCT 3420
CTCGTACTAG GAGCAGATCC TCTCAAATTT CCTACGCCCG CGACGGATAG GGACCGAACT 3480
GTCTCACGAC GTTCTGAACC CACCTCGCGT 3510

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20986 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
CGGAGAAAAA CATGGCTAAG TCAAACTTTG AAAAAGTAGA ATCAGTTGTT GGCTGGGTTC 60
GTGATAAGAA AATCACAGCC TACCGTATCT CTAAAGAAAC CAATGCGCGT GAAATGTCTA 120
TCATTCCTCT GGCGCAGGGT CGTGCAAAAG TAAAAAATAT TTCATTTGAA ACAGCCCTAG 180
GCCTAATTGA TTTCTATGAA AAAAATTATG AAAAATTTGA AGATTAATCT TTGGATAACG 240
GCGGATTCTT GACCTTCAAG TAGTAGAGAT AGAGAATCTC CCTTTTCATT TTGAGGACAG 300
CAAAAAGACT GCACGGTTGA TGCAGCCTTT TCTTTTTATT TGAGATAGCG TTGAAGGAAC 360
TCTTTTGTTC GGTCTTCTTT AGGATTGGTG AAGAGGTCTT CTGGTTTACC TTCTTCAGCG 420
ATCACGCCCT TATCCATAAA GATAACACGG TGAGAGACAT CACGGGCAAA TTCCATTTCA 480
TGGGTTACGA CAATCATGGT CAAGCCTTCC TGAGCCAGGT CCTGCATGAT TTTGAGGACT 540
TCTCCAACCA TTTCTGGATC GAGAGCTGAT GTTGGTTCAT CAAAGAGAAT AGCGTCCGGA 600
TTCATGGAGA GGGCACGAGC GATGGCCACA CGTTGTTTTT GACCACCTGA GAGTTGTTTT 660
GGTTTGCCTT GCCAGTAGCG TTCTCCCATG CCGACCTTTT CCAGGTTTTC TTTGGCAATC 720
TTTTCAGCTT CTGTGCGTTC GCGTTTTAGG ACAGTTGTCT GAGCGACGAT TGTGTTTTCA 780
AGAACATTGA GATTTTCAAA GAGGTTAAAG GATTGGAAAA CCATCCCCAA CTTTTCACGG 840
TATTGCGTGA GCTCATAGCC TTTTTCGAGG ACGTTTTGTC CATGATAAAG GATTTGTCCA 900
TCAGTTGGTG TTTCAAGTAG GTTAATGGAG CGTAGGAAGG TCGATTTTCC GCTTCCAGAG 960
CTTCCGATGA TAGAGATGAC CTCTCCCTTG TGGACAGTGA GTGAAATGTC TTTTAGCACT 1020
TCGTTTTGTC CATAGGATTT TTTGAGGTGT TTAATTTCAA GGATTGCTTG TGTCATTATT 1080
TCAAATCCTC CGTTTGCATT TGGTTAGCAC CTCTAGTCTA GGTATCCATG TCCATTCTGC 1140
GCTCGATAAA GCGTAGGATA CGTGTTACGG TGAAGGTGAG GACAAAGTAA ATCACGGCGA 1200
TGATTGTAAA TGTCTGGAAG TATTGATAGG TTTGTGTTGC CACGGTATTT CCTCAGAAAT 1260
AAAGTTCGAC AACAGAGATA ACGTTCAATA CAGATGTATC TTTGATATTG ATGACAAATT 1320
CATTACCAGT TGCAGGTAGG ATGTTACGGA CTACCTGAGG TAGGACAATC TTACGCATGG 1380
TCTGGTTATG GGTCATACCA AGAGCAGTCG CAGCTTCAAA TTGTCCCTTG TCAACTGCTA 1440
GGATACCACC ACGGACGATT TCAGTCATGT AGGCACCGGT ATTGATTGAA ACGATGAAGA 1500
TAGCAGCCAG TGTACGGTCA AGGTTGATCC CGAAAGCTTG GGCAGTTCCA TAGTAGATAA 1560
CCATCGATTG AACAATCATT GGCGTACCAC GGAAAATTTC AATGTAGACA TTGAGAACCC 1620
AGCCGACTAG TTTTTGTAGG CCGTAAATGA CTTTGTTTTC AGAGAGAGGA GCAGTACGGA 1680
AGACACCAAT GGCAAGTCCA ATAATGAGAC CTATGATGGT TCCGACGATA GAGATTAAAA 1740
GAGTGATACC AGCACCACGC AAGAGTTGTT GCCAGTTTTC AGAAAGAATT TTAGCAACTT 1800
GGCTAAAGAA ACTACTGCTA GTCTCTTCAG TTGTTGTAGC TTCGGCAGGT TCTTCCTTGA 1860
TCATACGATC CATCAAGGCA ACTTGGTCAT CTTTTGAAAT GGTTTCAATG CTGGCATTGA 1920
TTTGGCTAAT ACGATTGTCA TTTTTACGAA GCCCGATAGC GATAGCTGTA TCTTCTTCCC 1980
CAGTTTTGAA ACCAGGTTCT ACTTGAATCA TCTTGAACTT AGAGTTCGCA GCTTCAGCAG 2040
TCAGTGCTTC TGGACGTTCA GAAACATAAG CATCAATGAC ACCAGCCTCA AGAGCTTGTC 2100
GCATTTGAGC GAAGTCTCCC ATGGCTGTTT CTTTTTTAGC ACCTGGGATT TGTGCAATCA 2160
AGTTATAAAG GTAGACCCCT TGTTGAGAAG TGATTTTTGC ACCGTTAAAG TCATCCAAAG 2220
ATTTAGCACT TGCGTAGGCA GAATCTTTTT TGACAAGCAA AACTGGTTCG CTAGTATAGT 2280
AACTGCTCGA AAAGGCAATT TCTTGTTTGC GTTCTGCAGT TGGACTCATA CCTGCGATAA 2340
TCATGTCAAT CTTACCAGAA GTAAGGGCAG GGACTAGACC TTCCCACTTG GTTTTAACAA 2400
CCAAAGGTTC TTTACCTAAG TCCTTAGCGA TTTTCTTGGC GATTTGAACA TCGTATCCGT 2460
TGGCATACTG ATTGGTCCCA TCGATTTTGA CAGCTCCGTT GCTATCATCA TCCTGGGTCC 2520
AGTTAAAGGG AGCATATGCT GCTTCCATAC CGATGCGTAA ATATTCATCG GCTTGAGCAA 2580
CATTGACAAG TCCTAGCATC AGCAAGAGAC TTGTGAAAAT AGATAACTAy ATGTGGCTCA 2640
TGATTTCTCC TATTCTGATC TATTAAAAAA TAACTGTCTC CTATTTTATC GAAAAATGCG 2700
TAATTTTTCA ACATAAGTAA GTCTTTACTT ACGAAAAAAT GCTATAATGA TAAGAAAGAT 2760
AAAAAGGGGG CTTAGTTGAT GAAAAAAACT TTTTTCTTAC TGGTGTTAGG CTTGTTTTGC 2820
CTTCTTCCAC TCTCTGTTTT TGCCATTGAT TTCAAGATAA ACTCTTATCA AGGGGATTi'G 2880
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TATATTCATG CAGACAATAC GGCAGAGTTT AGACAGAAGA TAGTTTACCA GTTTGAGGAG 2940 

GACTTTAAGG GCCAAATCGT GGGACTTGGA CGTGCTGGTA AGATGCCTAG CGGGTTTGAC 3000 

ATTGACCCTC ATCCAAAGAT TCAGGCCGCC AAAAACGGTG CAGAACTAGC AGATGTGACT 3060 

AGCGAAGTAA CAGAAGAAGC GGATGGTTAT ACTGTGAGAG TCTATAATCC AGGTCAGGAG 3120 

GGCGACATAG TTGAAGTTGA CCTCGTCTGC AACTTAAAAA ATTTACTTTT CCTTTATGAT 3180 

GATATCGCTG AATTAAATTG GCAACCTCTG ACAGATAGTT CAGAGTCTAT TGAAAAGTTT 3240 

GAATTTCATG TAAGGGGAGA CAAGGGGGCT GAAAAACTCT TTTTCCATAC AGGGAAACTT 3300 

TTTAGAGAGG GAACGATTGA AAAGAGTAAC CTTGATTATA CTATCCGTTT AGACAATCTT 3360 

CCGGCTAAGC GTGGAGTTGA GTTGCATGCC TATTGGCCTC GGACCGATTT TGCTAGCGCT 3420 

AGGGATCAGG GATTGAAAGG GAATCGTTTA GAAGAGTTTA ATAAGATAGA AGACTCGATT 3480 

GTTAGAGAAA AAGATCAGAG TAAACAACTC GTTACTTGGG TCCTCCCTTC CATCCTTTCC 3540 

ATCTCCTTGT TATTGAGTGT CTGCTTCTAT TTTATTTATA GAAGAAAGAC CACTCCTTCA 3600 

GTCAAATATG CCAAAAATCA TCGTCTCTAT GAACCACCAA TGGAATTAGA GCCTATGGTT 3660 

TTATCAGAAG CAGTCTACTC GACCTCCTTG GAGGAAGTGA GTCCCTTGGT CAAGGGAGCT 3720 

GGAAAATTCA CCTTTGATCA ACTTATTCAA GCTACCTTGC TACATCTGAT AGACCGTGGG 3780 

AATGTCTCTA TCATTTCAGA AGGAGATGCA GTTGGTTTGA GGCTAGTAAA AGAAGATGGT 3840 

TTGTCAACCT TTGAGAAAGA CTGCCTAAAT CTAGCTTTTT CAGGTAAAAA AGAAGAAACT 3900 

CTTTCCAATT TGTTTGCGGA TTACAAGGTA TCTGATAGTC TTTATCGTAG AGCCAAAGTT 3960 

TCTGATGAAA AACCGATTCA AGCAAGAGCG CTTCAACTCA AATCTTCTTT TGAAGAGGTA 4020 

TTGAACCAGA TGCAAGAAGG AGTGAGAAAA CGAGTTTCCT TCTGGGGGCT CCCAGATTAT 4080 

TATCGTCCTT TAACTGGTGG GGAAAAGGCC TTCCAACTGG GTATGGGTGC CTTGACTATC 4140 

CTGCCCCTAT TTATCGGATT TGGTTTGTTC TTGTACAGTT TAGACGTTCA TGGCTATCTT 4200 

TACCTCCCTT TGCCAATACT TGGTTTTCTA GGGTTAGTTT TGTCTGTTTT CTATTATTGG 4260 

AAGCTTCGAC TAGATAATCG TGATCGTGTT CTAAATGAAC CGGCAGCTGA GGTCTACTAT 4320 

CTCTGGACCA GTTTTGAAAA TATGTTGCGT GAGATTGCAC GATTGGATCA GGCTGAACTG 4380 

GAAAGTATTC TGGTCTGGAA TCGCCTCTTG GTCTATGCGA CCTTATTTGG CTATGCGGAC 4440 

AAGGTTAGTC ATTTGATGAA GGTTCATCAG ATTCAAGTGG AAAATCCAGA TATCAATCTC 4500 

TATGTAGCTT ATGGCTGGCA CAGTACGTTT TATCATTCAA CAGCACAAAT GAGCCATTAT 4560 

GCTAGTGTCG CAAATACAGC AAGCACCTAC TCTGTATCTT CTGGAAGTGG AAGTTCTGGT 4620 

GGTGGCTTCT CTGGAGCCGG AGGTGGCGGC AGTATCGGTG CCTTTTAAAG AGAGCTACCA 4680 

TAGACTGAAA AAGTATGATA TAATGGAAGA TAGAAAAAAG ACAAACTATA AGAAAAGTCA 4740 

ATAGTTTTAT CTAAACTATT TCTTATTTCA ATTTGATGAT TTGGCGATGA TTTTAGAGCA 4800 

CGGCAAAAAG CCCTTGAAAA AGTCCATTTT TTCAAAGGTA ATCCTGTGTT AATTTCAGAA 4860 

ATTACATCAC TTTTTGTTCG TCAAATGGCA GCTCTTTTTT AGGATATAAA ACAGGGTTCG 4920 

GATAAGTTTT TTTGCAAGGT GGATGATGCC TACATTGTAA TGTTTTCCTT ATTCTAACTT 4980 

AGTCTTAAGA TAGGCCTTAG AAGCAGGTGA AAAGCGAGGG CATGCTTTGG CAGCTTGTAT 5040 

GAGTGCCCAC CGCAGATGAG GGGAACCCCG TTTGACCATT CTTCCAGCTA AATCAATCTG 5100 

ACCTGACTGA TAAATAGAAG AATCCAGTCC AGCGAAAGCT TGTAATTGAG CAGGATTATC 5160 

AAAGGCATGA ATATTTCGAA TCTCGGCTAA AATGACCGCC CTAAACGATC CCCAATCCCA 5220 

GTAACCGTCG TGATGACCGA GTTGAACTCA GCCATCGAGT CATTGATACA TGTTTCCGCC 5280 

TTGTCAATGA GCCTCTTGTA ATGCTTGATG ATTTCGAATT CACGAGCAGG AGATGTTGTT 5340 

CCGATAGAAC GAGGTGCGAC TGAGAGGATA TCCTGAATTT TAGAAGCGGT CAATCGCTTA 5400 

ATTTCTATCA GCTTATCAAA TCCTGCCTCA ATCCTTTTCT GAGGATTAGG GTAGCGTGTC 5460 

AAGAGTTGCT AGGTATATTC tgaatgcttt CCAACGATTT TATCCAACTC AGGAAAGATG 5520 

ATATCAAGAC AACGAGTGTA TTGTACTTTC CAATCAGACT CTTTTTCTTG AGACGATGAA 5580 

TATGTCTAGC CAGTATTTTT AGGTCTACTT GCCGATTATC GTGTTGAAAT TGTTCACGAT 5640 

TGGGGTCAGA AAGAAGTTTA AGAGCGATGC CATGAGCGTC TTTCTTATCC GTTTTAGTCT 5700 

TGCCAAGTCA TAATGATTTG GCAAATTCCT TGATGAGCAA AGGATTGTAG GTGTAAACTT 5760 

TATATCCTTG TTCATGCAGG AAGTTCAGTA GATTAAAGGC ATAATGTCCA GTATCTTCAA 5820 

GAGCGATGAG ACAGTCTTGG TTGATCTGTC GAATAGACAG ATCTAAGAGT TCAAAACCAG 5880 

CTTTATTATT TGAAAAAGTG AGTGGTTTAA GAACAGTTTT TCCTGGAACA TTCAAGGCTG 5940 

TAACATCGTG TTTATTTTTA GCGATATCAA TGCCTACATA AAGCATGGCA GTACCTCCAG 6000 

ATATAGTATT TCAAGTCTAC TTGGTTATCC ACGAATTTTT TGCCTTGTTA CCTTAGACGA 6060 

GATCAAACGT CTATGCGTTA TCAAACTCAT TACCAATTGA AACAAAAGCT GTGGTTAGAG 6120 

CCTTTCGGAA ATCGTCAACC GATTGGAGGA AATGAACTAA TCCATAGTGG CTTATTCCAA 6180 

GTATACCACT TGGGCTTTGG CAGTAGCTAA CTGCGCTAAA TATAATATAG GGAGTAATCT 6240 

ATGTATCTTA TTGAAATTTT AAAATCTATC TTCTTCGGAA TTGTTGAAGG AATTACGGAA 6300 

TGGTTGCCGA TTTCCAGTAC AGGTCACTTG ATTTTAGCAG AGGAATTCAT CCAATACCAA 6360 

AATCAAAATG AAGCCTTTAT GTCCATCTTT AATGTCCTGA TTCAGCTTGG TGCTATTTTA 6420 

GCAGTTATGG TGATTTATTT TAACAAGCTC AATCCTTTTA AACCGACCAA GGACAAACAG 6480 

GAAGTTCGTA AGACTTGGAG ACTATGGTTG AAGGTCTTGA TTGCTACTTT ACCTTTACTT 6540 

GGTGTCTTTA AATTTGATGA TTGGTTTGAT ACCCACTTCC ATAACATGGT TTCAGTTGCT 6600 

CTCATGTTGA TTATCTACGG GGTTGCCTTC ATCTATTTGG AAAAGCGCAA TAAAGCGCGT 6660 

GCTATCGAGC CAAGTGTAAC AGAGTTGGAC AAGCTTCCTT ATACGACCGC TTTCTATATC 6720 

GGACTCTTCC AAGTTCTTGC TCTTTTACCA GGGACTAGCC GTTCAGGTGC AACGATTGTC 6780 

GGTGGTTTGT TAAATGGAAC CAGTCGTTCA GTTGTGACAG AATTTACCTT CTATCTTGGG 6840 

ATTCCTGTTA TGTTTGGAGC TAGTGCCTTA AAGATTTTCA AATTTGTGAA AGCCGGAGAA 6900 

CTCTTGAGCT TTGGGCAATT GTTTTTGCTC TTGGTCGCGA TGGCACTAGC TTTTGCGGTC 6960 

AGCATGGTGG CTATTCGCTT CTTGACCAGC TATGTGAAAA AACACGACTT CACCCTTTTT 7020 

GGTAAATACC GTATCGTGCT TGGTAGTGTT TTGCTACTTT ACAGTTTTGT CCGTTTATTT 7080 

GTATAAGAAA AACCTTGAAG GGGCAACTCT TCAAGGTTTT ATACTCTTCG AAAATCTCTT 7140 

CAAACCGCGT CAGCTTTATC TGCAACCTCA AAACAGTGTT TTGAGCAGCn CTGCGGCTAG 7200 

CCTCCTAGTT TGCTCTTTGA TTTTCATTGA GCTTTAAAAT CCAGTCATGG TAATCCCCAA 7260 

TAGGCGGACA CCTCTTTCTT TCTTGCTTAA TTCTTCATAG AGTTGCAGGG CTATTTGGCT 7320 

TATCTGACTA GCATCTTGTG TTTTTTGAGC AAGACTTTTT CGTTTGGTAA GAGTTGAAAA 7380 

GTCCTCGTAG CGGATTTTCA AAATGACAAT TTTTCCAGCT TTTTCTTGTT GATGTAGATT 7440 

GAGAGCGACT TTTTCTGATA GAAGAGTCAG CTCTTTTTTG ATATCTTCCT CAGCAAGGAG 7500 

AATCTTCCCG TAGGTTTTCT CCTTGCCGAT TGATTTACGG ATGCGATTGG ATTTGACTGG 7560 

AGAGTTGTGA ATGCCACGAG CCTTTCGATA CAGATCATAG CCTAGTCTAC CAAAACGGTC 7620 

TATTAGGGTT ACCTCAGGAA CTTCAAGTAA ATCAGCACCA GTAAAAACGC CCATTTGATG 7680 

AAGACGTTCT ACTGTCTTTT TTCCTACTCC ATGAAATTTG GAAATATCCA TTTGTTTGAG 7740 

AAAATCCTCA GCCTGTTCAG GTAGAATCAC TGTCAAACCA TGTGGTTTTT GATAATCACT 7800 

CGCCATTTTA GCTAAGAATT TGTTGTAAGA AACGCCTGCG GAAGCAGTTA GATGGAGTTC 7860 

TTGCCAGATA TCTTTTTGAA TGAGGCGAGC AATTTTGACC GCTGACTTGA TACCGAGTTT 7920 

ATTTTCTGTC ACATCCAAAT AGGCTTCGTC AATGCTCATG GGTTCAATCA AATCTGTATA 7980 

GCGCTTAAAA ATAGCTCGAA TCTGGAGTCC CACAGACTTG TATTTCTCAT AATTCCCTGA 8040 

GATAAAGACA GCCTGGGGAC AACGTTCATA AGCTTCCTTG GAACTCATGG CAGAATGGAC 8100 

ACCAAAAGCT CTTGCCTCAT AACTACAGGT AGAAACGACT CCCCGTCCAC CTGTTTGCCG 8160 

AGGGTCGCTT CCAATAATGA CAGGTTTTCC TCTGAGTTTA GGATTATCCC TGATTTCCAC 8220 

TGCAGCAAAA AAGGCATCCA TGTCAATATG GATGATTTTT CTTGACAAAT CATTTAACAA 8280 

AGGAAAAATC AACATGCCTA GCACCTTTTT ATACTCTTCG AAAATCTCTT CAAACCACGT 8340 

CAGCTCTATm TGCAACCTCA AAACAGTGTT TTGAGCAATC TGCGGCTAGC TTCCTAGTTT 8400 

GCTTTTCGAT TTCCATTGAG TGTTACTGCT TATTyTCTTT TATTATACCC TTTTTTCTGA 8460 

AAAAAAGAAA AAAGGACTTT ATTTTTTCAA AAATATAATA CAGTTTGAAA TAAAATATAG 8520 

ACTGTTTTAG AAAAGAAAGT GTAAAAATAG GGAATTTTCA CTTGTTGAAA TCGGTTACTA 8580 

TATGGTATAC TTGTCTTATG AATGTAACAG ATGACTGTTA CTAGAAAAAA GAGGACATTA 8640 

ATATGGTTGT TAAGACAGTT GTTGAAGCAC AAGATATTTT TGACAAAGCT TGGGAAGGCT 8700 

TCAAAGGCGT AGATTGGAAA GAAAAAGCAA GTGTATCACG ATTTGTACAA GCTAACTACA 8760 

CACCTTATGA TGGAGACGAA AGCTTCCTTG CAGGACCAAC AGAGCGTTCA CTTCACATCA 8820 

AGAAAATTGT AGAAGAAACT AAAGCACACT ACGAAGAAAC TCGTTTCCCA ATGGACACTC 8880 

GTCCAACATC TATCGCTGAT ATCCCTGCTG GATTTATCGA CAAAGAAAAT GAAGTTATCT 8940 

TCGGTATCCA AAACGATGAA CTCTTCAAAT TGAACTTCAT GCCAAAAGGT GGTATCCGTA 9000 

TGGCTGAAAC TACTTTGAAA GAAAATGGAT ACGAACCAGA CCCAGCTGTT CACGAAATCT 9060 

TCACTAAATA TGTAACAACA GTTAACGACG GTATTTTCCG TGCCTACACT TCAAATATTC 9120 

GTCGCGCTCG TCACGCACAC ACTGTAACTG GTCTTCCAGA TGCATACTCA CGCGGACGTA 9180 

TCATCGGTGT TTACGCACGT CTTGCTCTTT ACGGTGCAGA CTACTTGATG CAAGAAAAAG 9240 

TAAATGACTG GAATGCAATC AAAGAAATCG ATGAAGAAAC AATCCGTCTT CGTGAACAAG 9300 

TAAACCTTCA ATACCAAGCA TTGCAACAAG TTGTTCGCCT GGGTGACCTT TACGGGGTTG 9360 

ATGTTCGCAA ACCAGCGATG AACGTGAAAG AAGCAATCCA ATGGGTTAAC ATTCCTTTCA 9420 

TGGCTGTCTG CCGTGTGATT AACGGTGCTG CTACATCTCT AGGTCGTGTA CCAATCGTAT 9480 

TGGACATCTT TGCAGAACGT GACCTTGCTC GTGGTACATT TACTGAATCA GAAATCCAAG 9540 

AATTCGTTGA TGATTTCGTT ATGAAACTTC GTACAGTTAA ATTTGCTCGT ACAAAAGCTT 9600 

ATGACCAATT GTACTCAGGT GACCCAACCT TTATCACAAC TTCTATGGCT GGTATCCGTA 9660 

ACGACGGTCG TCACCGTGTT ACTAAGATGG ACTACCGTTT CTTGAACACT CTTGACAACA 9720 

TCGGTAACTC ACCAGAACCA AACTTGACAG TTCTTTGGAC TGACAAATTG CCATACAACT 9780 

TCCGTCGCTA CTGTATGCAC ATGAGCCACA AACACTCTTC TATCCAATAC GAAGGTGTAA 9840 

CAACAATGGC TAAAGACGGA TATGGTGAAA TGAGCTGTAT CTCATGCTGT GTGTCTCCAC 9900 

TTGATCCAGA AAATGAAGAA CAACGCCACA ACATCCACTA CTTCGGTGCT CGTGTAAACG 9960 

TTCTTAAAGC CCTTCTTACT GGTTTGAATG GTGGTTACGA CGATGTTCAC AAAGACTACA 10020 

AAGTATTTGA TATCGAACCA ATCCGTGACG AAGTTCTTGA ATTTCAATCA GTTAAAGCGA 10080 

ACTTTGAAAA ATCTCTTGAC TGGTTGACTG ACACTTACGT AGATGCCTTG AACATCATCC 10140 

ACTACATGAC TGATAGGTAC AACTACGAAG CTGTTCAAAT GGCCTTCTTG CCAACTAAAC 10200 

AACGTGCCAA CATGGGATTC GGTATCTGTG GATTTGCTAA CACTGTTGAT ACATTGTCAG 10260 

CTATCAAATA CGCTACAGTT AAACCAATCC GTGACGAACA TGGCTACATC TACGATTACG 10320 

AAACAATCGG TGACTACCCA CGCTGGGGTG AAGATGACCC ACGTTCAAAC GAATTGGCAG 10380 

AATGGTTGAT CGAAGCTTAC ACAACTCGTC TACGTAGCCA CAAACTATAC AAAGACGCAG 10440 

AAGCTACAGT ATCACTTTTG ACAATCACAT CTAACGTTGC TTACTCTAAA CAAACTGGTA 10500 

ACTCACCAGT TCACAAAGGT GTATACCTCA ACGAAGATGG TTCTGTGAAC TTGTCTAAAC 10560 

TTGAATTCTT CTCACCAGGT GCTAACCCAT CTAACAAAGC TAAAGGTGGT TGGTTGCAAA 10620 

ACTTGAACTC ACTTTCTAGC CTTGACTTTA GTTATGCAGC TGACGGTATC TCATTGACTA 10680 

CACAAGTATC ACCTCGCGCT CTTGGTAAGA CTCGTGATGA ACAAGTTGAT AACTTGGTAA 10740 

CAATTCTTGA TGGTTACTTC GAAAACGGTG GACAACACCT TAACTTGAAC GTTATGGACT 10800 

TGAACGATGT TTACGAAAAA ATCATGTCAG GCGAAGACGT TATCGTACGT ATCTCTGGAT 10860 

ACTGTGTAAA CACTAAATAC CTCACTCCAG AACAAAAAAC TGAATTGACA CAACGTGTCT 10920 

TCCACGAAGT TCTTTCAATG GATGACGCCT TGGATGCATT GAGCTAATCA AGTTCTTGAA 10980 

TAATAAAAAG GAACCCTCGG TCAAACGACT GAGGGTTTTG TGCTTGGGAT AGTATGAGCA 11040 

ATTCCTTCGG CGCAATATGC AATGTTTTTG GCCTCTTTGT CAACTGTAGT GGGTTGAAAA 11100 

AAAGCTAACC TTGAGAAAGG ACAAATTTCG TCCTTTCTTT TTTGATGTTC AGGGCGATAA 11160 

AAATCCGTTT TTTGAAGTT7 TCAAAGTTCC GAAAACCAAA GGCATTGCCC TTGATGTCTT 11220 

TGATGAGTTT GTTAGTGGCC TCAAGTTTAG CCTTAGAATA AGGCAATTCA ATGGCGTTAG 11280 

TGATGTAGTT TTTATAGCAA ATAAATGTGC TCAAAGTGGT TTTAAAGGTG CGGTTGAGAT 11340 

GAGGTAACGT GTCTTGAATT AAGCCCCAAA ACTGGTCAGT ATTCTTCTCT TGTACATGAA 11400 

ATAGGAGTAG TTGATACAGG TCATAGTAAT CTTTAAGTTC AGGTACTAGA GTAAAGATTT 11460 

TCTTCAGACA CTCCCTAGGA GTTAAGGTCT CTCTGAAAGT TCTAGCATAG AAAGGCTTAA 11520 

GAGAGAGTTT CCCACTATCT TTTAGGATAA ATTTCCAGTA ATATTTAAGA GCTCTGTATT 11580 

CCAGAGATTT ATCATCAAAT TGCTTCATGA TGTTGATTCT AGTCTGATTA AGAGCCCTGC 11640 

TCATCTGTTG GACAATGTGG AAACGATCGA GAACAATTTT AGCATTGGGA AATAATTTCT 11700 

TAATGAGAGG GATATAACTT CCAGACATAT CAACAGTGAC GACTTTAACT TTTTTTCTAG 11760 

CTTCTTTCGA GTACTTGAAG AAATGATTTC GGATGGTTGT TTGACGTCTG TTATCAAGAA 11820 

TGGTCATGAT TTTCTTAGTG TTGAAATCCT GAGCAATGAA AGCCAATTTC CCCTTCTCGT 11880 

AGGAGAATTC ATCCCAGGAG AGGATTTCAG GCAAACTGGT GTAATCCTCT TGGAAATGAA 11940 

ATTGCTTGAG CTTACGATAG ACCGTAGAGG TAGAGGTAGA GGTAGAGATG GCTAATTTAG 12000 

AAGCGATATG TGTAAGAGCC TCTCTGTTGA GTAGCAGTTG GGCAATTTTC TGTCTCACCA 12060 

TTTCCGAGAT TTGGCAATTT TTCTGAACGA GAGTTGTTTC AGCTACAGTG ACTTTCCGAC 12120 

AGGACTTGCA TTGAAATCGT CTCTTTTTCA AATGAATGAG GCTAGGGAAA CCACCAATCT 12180 

CGATAAAAGG GATTTTAGAA GGCTTTTGGA AGTCGTATTT GATTTGTTTT CCTTTACAGT 12240 

GTTTACATTT AGGTGGGTGA TAATCAAGTG TAGCGAAGAC TTCGATATGG GTATCGTGCT 12300 

GAATGGCTTT ATTTAAGGTG ATGTTTTTGT CTTTTATTCC GATGAGTAAT GTGGTATGAT 12360 

TGATGTGTTC CATAAGATAC TTTCTAATGA GTTGTTTAGG CGCTTTTCAT TATAAGTCTT 12420 

ATGGGACTTT TTTGATACTC AAAAAGCCCT ATAATCTCCA CAGTGGGATT TACCCACTAC 12480 

AGAAATTATA GAGCCAGAAA AAACACTTTT GTTCACTAGC AGAAACTAGA GACCAGAAGT 12540 

GTTTTTCTGT TCAGATTTAC CCAAAACTGG GAAATATGGG GATAAGAATA GAGATGGCTT 12600 

AGGAAGCCCC TTTTTGTGTG TAGACAGTAC GATGAACTTA TAACAAATAG TGAGCCTTTT 12660 

TAGCAATCAT TGCGACCCGT TTGTCAAAAG CCTCTTTTCG GATATCTACA ATTGTCTGAT 12720 

AGATGAGACG CTGTTCCCTA ACATGCAAAT CTAAGGCAAT CGTCAAAAAG TGATGTTTCC 12780 

CTTTGGGATA CTCCTTTTTA ACGTAAGGCA GGTATTCTTT CGTTCTAATA ATAATCAATG 12840 

GCTCTGTCAA ATGCTCCTCT GAAGGAGGAG GACTAATTAG AATATTGTAT CCTGTAACAG 12900 

AGGCAACTTT GTCAGTAAAA TTCCGTAAAA TAATGGACTT TATTAAGTTT ACATCTGCTT 12960 

GATTATTTAA AATGATAAAA ATCGGGATAG CAGGTAGTGA GGAAAAGATG GTTTCTGTCA 13020 

AGTAGAGTGA GAAAAGGTAC AGCCGATGCT GGTCGATAAC TCCTTCAATC TTCTGCTCAG 13080 

TCATCCACTC TTGAACAATT GCTTTCGAAA TATGATACAG TGGCTTGTCG CTTTCAATCC 13140 

CATAATGTTC GTAATAATTA TAATAGGGAA CTAGATTTTG TAAACCAAAC AAAAACGTTC 13200 

TTGTTAAGAA AGTCAGTGCT GTTAAAAAAG AAAGACAATT CGAAATGTCA TTTCCTAAGA 13260 

TATTCTTGAA CTTGGATAGT AGATGCTTTC CTCTTGTATG CTGAAGAATC AGTTGAATAG 13320 

TATGAGTCTT TTTTTCTTGA TTCCATTTGT CCTTGGAAAA CGAAGAATTA GCAGAACAAT 13380 

AAACCAAAAA GATATAATCC AGTTCTTCCT GAGTAAAAGT CATGTTGGCA TGTGGCTCTA 13440 

AGTAAGTTTG GCAATGTTCC ATCAAAATCG GATACATAAA GAGGTTTTTT AATTTTTCAA 13500 

ACTCTTTGGA CTCAGGGAAC TCAAGTGGAA ATTCCCGACG TTTCCAAGTG AGTGCCACTA 13560 

GTATGCTAAA ATGAACATAC TCGTCAGGTG TGATTTCTAA CAGTTCATGA CTGAGTTGAG 13620 

AATTAGACTG CACAATCATA TGTGTGACCC AATCCATACT TCCATCATTC AAATCATAAA 13680 

TCTCAATACC AAAATGAAAC TGGAGGAGTG CAATTAAAAA ACGAATGCGA TATTCAGGAC 13740 

CAACTACTTG ATTTTTCACA AGGTCCAAAC CTACTGAACG TAGTAACAAG CCACACTTTT 13800 

GTCGTACGCG GTACCCTGTT GCGATGGAAA TATACTCTTT TTGTGTAAAT TCGTTAAAGC 13860 

TTTGATTACC TTGTAGTAGA AAGAAGCGGA GTATTTTTAA AATAGTTGAT TGGTTATAAA 13920 

GCTGATGGAA GTAATAATTC GTTTGATGAG AATGGTGTTC CATTAATTGA ACTTGTTGCG 13980 

TATCTAAATT AAATGTCAAC TCTTCCTCGA ATGTTTCTTG TAATTCCTGC AAAATGCTTA 14040 

GGAGACTTTT AGATTGTAAT GAAGTTAAAG TAGACAGTTC ATCTAGTTCA ATAGACCGAA 14100 

TATCCAATAA TATATTTAAA ATGGTAATTT TATCTGTAAT TCTTTTTTCA ATGTATTTGT 14160 

TTAGCATACT TACCGAATCT TAGTTGCATA TAGATAATTT TAATTATTAT AATACAAAAG 14220 

AAACTAATTG TCTTGTCAAA AAGGTTGTGG AATTTCCGAC TTTATTGATA AAACAGCATG 14280 

TAATAAAAGG CATTTTAAAG ATAGTAATGA GTATTGGTGG AGTTTTATCC CTTATTTTTT 14340 

TTATTAGAAA ATATTTTTTT ATCAAATATT GTCGTTCTAT AAAAAAATAT GTGATAAAAA 14400 

TATCTATTGT GATGGAAGTT GTTTTAATTT ATACTAGGAT AGTTAATAGT AATACTATAC 14460 

TATACTATAT TGTATACAAG TGTGTCATTG CCAGGTTGAG AAGATAGCTA TAACGCACTT 14520 

TTATACGCTT TTGCTACGTT TGTTAGTGAA CGGATTAACT CAGTGAGATA AATTTTATCA 14580 

GAACATAAGT AATCCGTTTC TTCGTGTATA CAGATTGAAA GTACCTATGA ATCATAGAAG 14640 

GATTAACTTG TTCTATGAAT AATGCTTAAC AGGGAGACAC ACATGAAAAA AGTAAGAAAG 14700 

ATATTTCAGA AGGCAGTTGC AGGACTGTGC TGTATATCTC AGTTGACAGC TTTTTCTTCG 14760 

ATAGTTGCTT TAGCAGAAAC GCCTGAAACC AGTCCAGCGA TAGGAAAAGT AGTGATTAAC 14820 

GAGACAGGCG AAGGAGGAGC GCTTCTAGGA GATGCCGTCT TTGAGTTGAA AAACAATACG 14880 

GATGGCACAA CTGTTTCGCA AAGGACAGAG GCGCAAACAG GAGAAGCGAT ATTTTCAAAC 14940 

ATAAAACCTG GGACATACAC CTTGACAGAA GCCCAACCTC CAGTTGGTTA TAAACCCTCT 15000 

ACTAAACAAT GGACTGTTGA AGTTGAGAAG AATGGTCGGA CGACTGTCCA AGGTGAACAG 15060 

GTAGAAAATC GAGAAGAGGC TCTATCTGAC CAGTATCCAC AAACAGGGAC TTATCCAGAT 15120 

GTTCAAACAC CTTATCAGAT TATTAACGTA GATGGTTCGG AAAAAAACGG ACAGCACAAG 15180 

GCGTTGAATC CGAATCCATA TGAACGTGTG ATTCCAGAAG GTACACTTTC AAAGAGAATT 15240 

TATCAAGTGA ATAATTTGGA TGATAACCAA TATGGAATCG AATTGACGGT TAGTGGGAAA 15300 

ACAGTGTATG AACAAAAAGA TAAGTCTGTG CCGCTGGATG TCGTTATCTT GCTCGATAAC 15360 

TCAAATAGTA TGAGTAACAT TCGAAACAAG AATGCTCGAC GTGCGGAAAG AGCTGGTGAC 15420 

GCGACACGTT CTCTTATTGA TAAAATTACA TCTGATTCAG AAAATAGGGT AGCGCTTGTG 15480 

ACTTATGCTT CCACTATCTT TGATGGGACC GAGTTTACAG TAGAAAAAGG GGTAGCAGAT 15540 

AAAAACGGAA AGCGATTGAA TGATTCTCTT TTTTGGAATT ATGATCAGAC GAGTTTTACA 15600 

ACCAATACCA AAGATTATAG TTATTTAAAG CTGACTAATG ATAAGAATGA CATTGTAGAA 15660 

TTAAAAAATA AGGTACCTAC CGAGGCAGAA GACCATGATG GAAATAGATT GATGTACCAA 15720 

TTCGGTGCCA CTTTTACTCA GAAAGCTTTG ATGAAGGCAG ATGAGATTTT GACACAACAA 15780 

GCGAGACAAA ATAGTCAAAA AGTCATTTTC CATATTACGG ATGGTGTCCC AACTATGTCG 15840 

TATCCGATTA ATTTTAATCA TGCTACGTTT GCTCCATCAT ATCAAAATCA ACTAAATGCA 15900 

TTTTTTAGTA AATCTCCTAA TAAAGATGGA ATACTATTAA GTGATTTTAT TACGCAAGCA 15960 

ACTAGTGGAG AACATACAAT TGTACGCGGA GATGGGCAAA GTTACCAGAT GTTTACAGAT 16020 

AAGACAGTTT ATGAAAAAGG TGCTCCTGCA GCTTTCCCAG TTAAACCTGA AAAATATTCT 16080 

GAAATGAAGG CGGCTGGTTA TGCAGTTATA GGCGATCCAA TTAATGGTGG ATATATTTGG 16140 

CTTAATTGGA GAGAGAGTAT TCTGGCTTAT CCGTTTAATT CTAATACTGC TAAAATTACC 16200 

AATCATGGTG ACCCTACAAG ATGGTACTAT AACGGGAATA TTGCTCCTGA TGGGTATCAT 16260 

GTCTTTACGG TAGGTATTGG TATTAACGGA GATCCTGGTA CGGATGAAGC AACGGCTACT 16320 

AGTTTTATGC AAAGTATTTC TAGTAAACCT GAAAACTATA CCAATGTTAC TGACACGACA 16380 

AAAATATTGG AACAGTTGAA TCGTTATTTC CACACCATCG TAACTGAAAA GAAATCAATT 16440 

GAGAATGCTA CGATTACAGA TCCGATGGGT GAGTTAATTG ATTTGCAATT GGGCACAGAT 16500 

GGAAGATTTG ATCCAGCAGA TTACACTTTA ACTGCAAACG ATGGTAGTCG CTTGGAGAAT 16560 

GGACAAGCTG TAGGTGGTCC ACAAAATGAT GGTGGTTTGT TAAAAAATGC AAAAGTGCTC 16620 

TATGATACGA CTGAGAAAAG GATTCGTGTA ACAGGTCTCT ACCTTGGAAC GGATGAAAAA 16680 

GTTACGTTGA CCTACAATGT TCGTTTGAAT GATGAGTTTG TAAGCAATAA ATTTTATGAT 16740 

ACCAATGGTC GAACAACCTT ACATCCTAAG GAAGTAGAAC AGAACACAGT GCGCGACTTC 16800 

CCGATTCCTA AGATTCGTGA TGTGCGGAAG TATCCAGAAA TCACAATTTC AAAAGAGAAA 16860 

AAACTTGGTG ACATTGAGTT TATTAAGGTC AATAAAAATG ATAAAAAACC ACTGAGAGGT 16920 

GCGGTCTTTA GTCTTCAAAA ACAACATCCG GATTATCCAG ATATTTATGG AGCTATTGAT 16980 

CAAAATGGCA CTTATCAAAA TGTGAGAACA GGTGAAGATG GTAAGTTGAC CTTTAAAAAT 17040 

CTGTCAGATG CGAAATATCG ATTATTTGAA AATTCTGAAC CAGCTGGTTA TAAACCCGTT 17100 

CAAAATAACC CTATCGTTGC CTTCCAAATA GTAAATGGAG AAGTCAGAGA TGTGACTTCA 17160 

ATCGTTCCAC AAGATATACC AGCGGGTTAC GAGTTTACGA ATGATAAGCA CTATATTACC 17220 

AATGAACCTA TTCCTCCAAA GAGAGAATAT CCTCGAACTG GTGGTATCGG AATCTTGCCA 17280 

TTCTATCTGA TAGCTTGCAT GATGATGGGA GGAGTTCTAT TATACACACG GAAACATCCG 17340 

TAAAGTGTAG AAATGATAAT ATCTATGTTC TGAACGATAC TTTTAAGAAG TAGCACTCAA 17400 

GAAGAGATTT AAGTTTACTT GGTGAAACCT GTTTTATTCG TAAGTAAACT ATCATTGAAA 17460 

GGGGAGATGT TTTCGAAAAC TTGCACAGAA AAAGGATTAT TATTGTCATG TGTAATTCAT 17520 

TACATTGCTC ACAGTTGATT TTAAGAGATA TGAATAAGGA GAAATCATGA AATCAATCAA 17580 

CAAATTTTTA ACAATGCTTG CTGCCTTATT ACTGACAGCG AGTAGCCTGT TTTCAGCTGC 17640 

AACAGTTTTT GCGGCTGGGA CGACAACAAC ATCTGTTACC GTTCATAAAC TATTGGCAAC 17700 

AGATGGGGAT ATGGATAAAA TTGCAAATGA GTTAGAAACA GGTAACTATC CTGGTAATAA 17760 

AGTGGGTGTT CTACCTGCAA ATGCAAAAGA AATTGCCGGT GTTATGTTCC TTTGGACAAA 17820 

TACTAATAAT GAAATTATTG ATGAAAATGG CCAAACTCTA GGAGTGAATA TTGATCCACA 17880 

AACATTTAAA CTCTCAGGGG CAATGCCGGC AACTGCAATG AAAAAATTAA CAGAAGCTGA 17940 

AGGAGCTAAA TTTAACACGG CAAATTTACC AGCTGCTAAG TATAAAATTT ATGAAATTCA 18000 

CAGTTTATCA ACTTATGTCG GTGAAGATGG AGCAACCTTA ACAGCTTCTA AAGCAGTTCC 18060 

AATTGAAATT GAATTACCAT TGAACGATCT TGTGGATCCG CATGTCTATC CAAAAAATAC 18120 

AGAAGCAAAG CCAAAAATTG ATAAAGATTT CAAAGGTAAA GCAAATCCAG ATACACCACG 18180 

TGTAGATAAA GATACACCTG TGAACCACCA AGTTGGAGAT GTTGTAGAGT ACGAAATTGT 18240 

TACAAAAATT CCAGCACTTG CTAATTATGC AACAGCAAAC TGGAGCGATA CAATGACTGA 18300 

AGGTTTGGCA TTCAACAAAG GTACAGTGAA ACTAACTGTT GATGATGTTG CACTTGAAGC 18360 

AGGTGATTAT GCTCTAACAG AAGTAGCAAC TGGTTTTGAT TTGAAATTAA CAGATGCTGG 18420 

TTTAGCTAAA GTGAATGACC AAAACGCTGA AAAAACTGTG AAAATCACTT ATTCGGCAAC 18480 

ATTGAATGAC AAAGCAATTG TAGAAGTACC AGAATCTAAT GATGTAACAT TTAACTATGG 18540 

TAATAATCCA GATCACGGGA ATACTCCAAA GCCGAATAAG CCAAATGAAA ACGGCGATTT 18600 

GACATTGACC AAGACATGGG TTGATGCTAC AGCTGCACCA ATTCCGGCTG GAGCTGAAGC 18660 

AACGTTCGAT TTGGTTAATG CTCAGACTGG TAAAGTTGTA CAAACTGTAA CTTTGACAAC 18720 

AGACAAAAAT ACAGTTACTG TTAACGGATT GGATAAAAAT ACAGAATATA AATTCGTTGA 18780 

ACGTAGTATA AAAGGGTATT CAGCAGATTA TCAAGAAATC ACTACAGCTG GAGAAATTGC 18840 

TGTCAAGAAC TGGAAAGACG AAAATCCAAA ACCACTTGAT CCAACAGAGC CAAAAGTTGT 18900 

TACATATGGT AAAAAGTTTG TCAAAGTTAA TGATAAAGAT AATCGTTTAC CTGGGGCAGA 18960 

ATTTGTAATT GCAAATGCTG ATAATGCTGG TCAATATTTA GCACGTAAAG CAGATAAAGT 19020 

GAGTCAAGAA GAGAAGCAGT TGGTTGTTAC AACAAAGGAT GCTTTAGATA GAGCAGTTGC 19080 

TGCTTATAAC GCTCTTACTG CACAACAACA AACTCAGCAA GAAAAAGAGA AAGTTGACAA 19140 

AGCTCAAGCT GCTTATAATG CTGCTGTGAT TGCTGCCAAC AATGCATTTG AATGGGTGGC 19200 

AGATAAGGAC AATGAAAATG TTGTGAAATT AGTTTCTGAT GCACAAGCTC GCTTTGAAAT 19260 

TACAGGCCTT CTTGCAGGTA CATATTACTT AGAAGAAACA AAACAGCCTG CTGGTTATGC 19320 

ATTACTAACT AGCCGTCAGA AATTTGAAGT CACTGCAACT TCTTATTCAG CGACTGGACA 19380 

AGGCATTGAG TATACTGCTG GTTCAGGTAA AGATGACGCT ACAAAAGTAG TCAACAAAAA 19440 

AATCACTATC CCACAAACGG GTGGTATTGG TACAATTATC TTTGCTGTAC CGGGGGCTGC 19500 

GATTATGGGT ATTGCAGTGT ACGCATATGT TAAAAACAAC AAAGATGAGG ATCAACTTGC 19560 

TTAAGTAAGA GAGAAAGGAG CCATTGATGA CAATGCAGAA AATGCAGAAA ATGATTAGTC 19620 

GTATCTTCTT TGTTATGGCT CTGTGTTTTT CTCTTGTATG GGGTGCACAT GCAGTCCAAG 19680 

CGCAAGAAGA TCACACGTTG GTCTTGCAAT TGGAGAACTA TCAGGAGGTG GTTAGTCAAT 19740 

TGCCATCTCG TGATGGTCAT CGGTTGCAAG TATGGAAGTT GGATGATTCG TATTCCTATG 19800 

ATGATCGGGT GCAAATTGTA AGAGACTTGC ATTCGTGGGA TGAGAATAAA CTTTCTTCTT 19860 

TCAAAAAGAC TTCGTTTGAG ATGACCTTCC TTGAGAATCA GATTGAAGTA TCTCATATTC 19920 

CAAATGGTCT TTACTATGTT CGCTCTATTA TCCAGACGGA TGCGGTTTCT TATCCAGCTG 19980 

AATTTCTTTT TGAAATGACA GATCAAACGG TAGAGCCTTT GGTCATTGTA GCGAAAAAAA 20040 

CAGATACAAT GACAACAAAG GTGAAGCTGA TAAAGGTGGA TCAAGACCAC AATCGCTTGG 20100 
AGGGTGTCGG CTTTAAATTG GTATCAGTAG CAAGAGATGT TTCTGAAAAA GAGGTTCCCT 20160 
TGATTGGAGA ATACCGTTAC AGTTCTTCTG GTCAAGTAGG GAGAACTCTC TATACTGATA 20220 
AAAATGGAGA GATTTTTCTG ACAAATCTTC CTCTTGGGAA CTATCGTTTC AAGGAGGTGG 20280 
AGCCACTGGC AGGCTATGCT GTTACGACGC TGGATACGGA TGTCCAGCTG GTAGATCATC 20340 
AGCTGGTGAC GATTACGGTT GTCAATCAGA AATTACCACG TGGCAATGTT GACTTTATGA 20400 
AGGTGGATGG TCGGACCAAT ACCTCTCTTC AAGGGGCAAT GTTCAAAGTC ATGAAAGAAG 20460 
AAAGCGGACA CTATACTCCT GTTCTTCAAA ATGGTAAGGA AGTAGTTGTA ACATCAGGGA 20520 
AAGATGGTCG TTTCCGACTG GAAGGTCTAG AGTATGGGAC ATACTATTTA TGGGAGCTCC 20580 
AAGCTCCAAC TGGTTATGTT CAATTAACAT CGCCTGTTTC CTTTACAATC GGGAAAGATA 20640 
CTCGTAAGGA ACTGGTAACA GTGGTTAAAA ATAACAAGCG ACCACGGATT GATGTGCCAG 20700 
ATACAGGGGA AGAAACCCTT GTATATCTTG ATGCTTGTTG CCATTTTGTT GTTTGGTAGT 20760 
GGTTATTGTC TTACGAAAAA ACCAAATAAC TGATATTCAA TGTACATCAT TATGAATAGG 20820 
ATAGCAGGCT GAAGGGAAGA CCAGAGTACT CTGAGGTGAT GTTAATCAGG AATCATGGTG 20880 
ATGTGGCATG AATCATCAAT AACGGATATC ACGCTGGGCA GATTGTGCCA GCCTCATTGT 20940 
GGGTTATTGT TTGTAAAACG ATAGGACTGG TCTCGTAATC ATTTTA 20986 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21040 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



CCCAGCAAAA AGCCATCCGA AGATGACTTT TTTGCTATTT AATTTCTGTA TAAGTTACTT 60 

CCAAGCCACG CTTAACAGCT GGACGATTGG CAATTTTTTC TGCCCATTTT ACTAGATTTT 120 

GATAACTTGA GGCATCCAAG AATTTTGCAG AACCTTGGTA AAGATTTCCT TGAACTAACT 180 

GTCCATACCA AGACCAGATA GCAATATCTG CAATCGTATA GTCATTGCCT GCAATATAAG 240 

GTTTCTGAGC CAATTCCTTA TCCAATAAAT CCAACTGGCG TTTCACTTCC ATCGTAAAAC 300 

GGTTAATAGG ATATTCCAAT TTTTCAGGAG CATAATTGAA GAAATGTCCA AATCCCCCAC 360 

CTAGAAAAGG TGCTGCACCT GCTTGCCAGA ATACCCAATT CAAAACTTCT ACCTTTTCCA 420 

CAGGATTACT TGGTAAAAAG GCTCCAAATT TCTCAGCAAG GTAAAGAAGA ATATGAGCAC 480 

ACTCAAAGAC TCTTACGTTT TCAGTACCTG ACTGGTCCAA TAAGGCTGGA ATCTTGGAAT 540 

TTGGATTGAG CTTCACAAAG TCTGATCCGA ATTGATCCCC ATCCATGATA GCAATCTTAT 600 

ACAAGTCGTA AGCCGCTTCC TTAAAACCAG CTTCTAGTAA TTCTTCCAAT AAGATAGTAA 660 

CCTTCACACC ATTTGGTGTT CCCAGTGAAT AAACCTGAAA AGCTTGTTCT CCTTTTGGCA 720 

AGTTTTGTTC GAAACGGGCA CCTGCTGTTG GTCTGTTTAG CCCCGTAAAA GCTCCTTGAT 780 

TACTAGCTTC ATCCTGCCAT ACGGTCGGTA ATTGATATGC TGACATCCGA AACCTCCCTT 840 

AAATCGCATT CTTGTCAAAA CCGAGTTTGC GTTGAATAAA CTTAACGATT TCGACGATGA 900 

TAATCATTGA GAAGCTTCCA GCCATAACAA TTCCCCATTG TGACAAGTCT AGTTTGGTTA 960 

CGTGGAAGAT TCCTTCAAGC GGTTCTACAA CGATTGTTGC CATGAGAAGG ATAAAGGATA 1020 






CCAAGATGGA CCAGTTAAAG GTCTTAGACT TGAATGGGCC AACTGTCAAG ATCCATTGGT 1080 

AGACAGACTT GACATTGTAG GCATGGAAGA GCTGAATCAA ACCAAGGGTT GCAAAGGCCA 1140 

TCGTTAGGGC ATCTGCATGA ATAGCATGAT TGTCACCCAC ATGAACTGGG TAAGCAATCG 1200 

CAAGGCCATA AACACTCATA ACAAGAGCTG CTTGGAGTAC ACCTTGATAA ATGATAGAAC 1260 

TCAAAACACC ACCTGAGAAG AAGCTTGCCT TGCGTCCACG TGGTTTATGA TTCATGACAC 1320 

CAGGTTCCGC AGGTTCAACA CCAAGAGCGA TAGCTGGGAA GGTATCCGTT ACCAAGTTGA 1380 

TCCACAAAAG ATGAACCGGC TGTAAGACAT CCCAACCAAA CAAGGTTGAT AGGAAGATGG 1440 

TTAATACTTC AGCAGTATTA GCAGAAAGTA GGTACTGAAT AGTCTTTTGA ATGTTTGAGA 1500 

AGACCTTACG TCCTTCTTCC ACTGCGACGA TAATAGTCGC AAAGTTATCA TCTGCAAGAA 1560 

TCATATCAGA AGCCCCCTTA GAAACCTCTG TACCAGTGAT TCCCATACCG ATACCGATAT 1620 

CCGCTGTTTT CAGAGCTGGC GCGTCATTGA CACCGTCACC TGTCATGGCA ACGACTTTAC 1680 

CTTGTTTTTG CCAAGCCTTG ACGATACGAA CCTTGTGTTC TGGAGACACA CGGGCATAAA 1740 

CAGAGTATTG ACCAACGACT TTTTCAAATT CTTCATCTGA CAGTTCATTG AGTTCAGCAC 1800 

CAGTTAAAAC GTGACCTTCT GTATCGTTTG CGTCAATGAT TCCCAAACGT TTGGCAATGG 1860 

CTTCCGCTGT GTCTTGGTGG TCACCTGTAA TCATAATTGG ACGGATTCCC GCTTCCTTAG 1920 

CCACACGAAC AGCCTCAGCG GCTTCAGGAC GTTCAGGGTC AATCATCCCA ATCAAACCAG 1980 

TAAAAATTAA ATCATTTTCA AGCTCTTCAG AAGTGAGATT TTCTGGAATA CTATCGATAA 2040 

TCTTATAAGC ACCTGCAAGG ACACCCAAGG CTTGATCAGC CATTTCAGAA TTGTTTGTAC 2100 

GAATGAGATT TGTAACCTTC TCATCAATCG GAGCAATATC CCCAGCCTTA TCACGAAGAA 2160 

GACAACGTTT TAAGAGTTGG TCTGGCGCAC CCTTGACTGC TACAACGAAA CGACCATCTG 2220 

GCAATGGGTG AACTGTTGAC ATGAGCTTAC GGTCAGAGTC AAATGGCAAT TCAGCTACAC 2280 

CAGGATATTT CTCTAAGAAA CCTTTGACAT CATAGCCCTT GTCCAAGGCA TATTGGATAA 2340 

AGGCTGTTTC GGTTGGGTCA CCAATCAAGT TACCTTCCAC ATCGATTTTC GTATCATTCG 2400 

CCAAGACAAC TGAACGAAGT AGTGGCATTT CAAGACCTAG TTCAATATCA TCAGCTGAGT 2460 

CATGTAGAAC CGCATCGTAG AAGACTTTTT CGACTGTCAT CTTGTTCATA GTCAGCGTAC 2520 

CAGTCTTATC AGAAGCGATG ATTTCAGTTG AACCAAGTGT TTCAACTGCT GGCAACTTAC 2580 

GAACGATGGA ATGTCGTTTG GCCAAAACTT GAGTACCAAG AGAAAGAACG ATGGTAACGA 2640 

TAGCACGAAG TCCTTCTGGA ATGGCTGCAA CGGCAAGGGC AACAGAAGTC AACAACTCAC 2700 

CAAGTGGATT TTTCCCTTGA ATGAAGACAC CCACTACAAA AGTAACAAGG GCAATGACCA 2760 

AGATAGCATA GGTCAAGACC TTAGAAAGGT TGTTCAAATT TTGTTTGAGT GGTGTATCAG 2820 

TCTCATCCGC ATCTTCAAGC ATACCAGCAA TATGACCAAC TTCAGTGTAC ATACCTGTAT 2880 

TGACAACAAC ACCCATCCCA CGACCATAGG TTACGTTTGA GTTTTGGAAG GCCATGTTGA 2940 

CACGGTCACC AATACCAGCA TCTGTCGCAA GCTCGACTGA CAAGTCTTTT TCGACTGGTA 3000 

CAGATTCACC TGTCAAGGCT GCTTCTTCAA TTTTAAGAGA GTTGGCTTCT ATCAAACGTA 3060 

GGTCCGCTGG TACCACGTCA CCTGCTTCAA GGGCAACGAT ATCGCCTGGT ACCAATTCTT 3120 

TAGAGTCAAT CTCTGCCATG TGTCCATCAC GAAGAACGCG GGCAACTGGA CTAGACATGG 3180 

ATTTGAGGGC TTCAATAGCT TCTTCAGCTT TTCCTTCTTG GTAAACACCA AAGGCAGCGT 3240 

TGATGATAAC CACAGCTAGG ATGATAATGG CATCTGCGAT ATCTTCCCCA CCAGAAGTCA 3300 

CGACTGACAA GATTGctGCC GCAACTAGGA TGATAATCAT CAAATCCTTA AATTGCTCGA 3360 

TGAATTTGAC CAAGATTGAT CGTTTCTCGC CTTCTTCGAG TTCATTGTGC CCAAATTCGG 3420 

CAAGGCGCTT TTCCGCCTCA CTTGATGACA AACCTTGCTC GGTCGCATCC ACAGCCTGCA 3480 

AGACCTCTTC AGGGCTCTGA GTATAAAACG CTTGGCGTTT TTGTTCTTTT GACATGTGTC 3540 

TCCTCCTTGA CATTGTGTGC AAAACAGACT CTCTTTCTCT CATAGCTTTT CACGACAAAC 3600 

AAAAAGAAAC CTGTTAATCA TAACAAGTCT CGCTGTTTAA GATAGGGCCG GAAAGCATAC 3660 

TTTTCAGCAT AAAATTCGGA ATGACGACAC TATCACAGGT TTCTGCCAGC TACTCCCTTG 3720 

AGTAGTACCA TTATACCAAA TTTTGGGGAG TTTTCAAAGA GTAAAAACTG CCTTATTTGA 3780 

ATTTTTCCTT GAAAACCAGT ATAATGGTAG AATGCTATGT GACTAGAAAG GAACTTGAAT 3840 

GAAGCAATCT ATCTCAAATC TCAAGTTAGC TGAGCGTGGA GCCATTATCA GTATTTCGAC 3900 

CTATTTGATC TTGTCTGCAG CCAAATTAGC AGCTGGTCAT CTCCtTCATT CATCCAGTTT 3960 

GGTGGCCGAT GCTTTTAATA ACGTATCGGA CATCATTGGA AATGTGGCCC TCTTAATCGG 4020 

GATTCGGATG GCGCGCCACC TGCAGACCGT GACCACCGTT TTGGTCATTG GAAGATTGAA 4080 

GATTTGGCAA GCTTGATCAC TTCTATCATC ATGTTCTATG TCGGTTTCGA TGTTCTAAGA 4140 

GATACCATTC AAAAGATTCT CAGTCGGGAA GAAACGGTCA TTGATCCTCT TGGTGCAACT 4200 

CTAGGAATCA TTTCTGCAGC GATTATGTTT CTCGTCTATC TCTACAATAC TCGCCTCAGT 4260 

AAGAAATCCA ACTCCAATGC GCTGAAGGCA GCTGCTAAGG ACAATCTTTC TGACGCTGTT 4320 

ACCTCACTTG GAACCGCCAT TGCCATCCTA GCTAGTAGTT TCAATTATCC GATTGTGGAT 4380 

AAACTGGTTG CTATCATCAT CACTTTCTTT ATCTTGAAGA CTGCCTATGA TATCTTCATC 4440

GAGTCTTCCT TTAGTCTTTC AGATGGCTTT GACGACCGCC TCCTCGAGGA CTACCAAAAG 4500 

GCTATCATGG AAATTCCCAA AATCAGCAAG GTCAAATCGC AAAGAGGTCG CACCTACGGT 4560

AGCAACATCT ACCTGGATAT TACACTAGAG ATGAATCCTG ACTTGTCTGT TTTTGAAAGC 4620

CATGAAATCG CGGATCAGGT CGAGTCTATG CTGGAGGAGC CTTTTGGCGT CTTTGATACC 4680 

GATGTCCATA TCGAACCAGC ACCTATCCCT GAGGATGAAA TTTTAGACAA TGTCTATAAA 4740 

AAATTGCTTA TGCGTGAACA ATTGATTGAC CAAGGAAACC AACTAGAAGA ACTCTTGACT 4800

GATGATTTTC TCTATATTCG CCAAGATGGA GAGCAGATGG ATAAAGAGGC TTATAAGACC 4860

AAAAAAGAGT TAAATTCTGC TATCAAGGAC ATTCAAATTA CTTCCATCAG TCAAAAAACC 4920

AAACTCATCT GCTATGAGTT AGATGGTATC ATCCATACCA GTATCTGGCG TCGCCACGAA 4980 

ACCTGGCAAA ATATCTTTCA TCAAGAAACC AAAAAAGAAT AGAGAAATCC TTTCATGAGA 5040 

CGGGATTTTT CTATTCTTTT ATACTCAATA AAAATCAAAG TGCAAATTAG GAAGCCCGTC 5100 

ACAGGCTGTA CTTGAGTCGG CAATGTGAAG CCGACATAGT TTGCACTTTG ATTTTCGAAT 5160 

AGTCTTAACT ATCAAATTCA CTGAGATACT CATAGCGTTC GTATTTTTCA AGGAGTGCTT 5220

CATTTTTCTC ATCCAATTCT TTTTGGAGAG TAGCCAGCTT ACCAAAGTCA GAGCCGTTAG 5280 

CCTGCATTTC CTCTTCAATA GCAGCGATAC GTTTTTCCAA GGTTTCAATA TCACCTTCAA 5340 

TACTTGCCCA CTCCTGCTTT TCTTGGTAGG TCATGCGTTT CTTGTCTTCT CGAACCTTGA 5400 

CCACTTTTTC CTTTTCGGCC TTTTGCACTT GATTGGCCAT ATCTGTTTCA AAAGCTTTTT 5460 

CATCAAGATA GTCGGTGTAA TGACCAAAGA AAGGACGAAT CTTGCCATCC TCAAAAGCGA 5520 

GAATCTTGGT CGCTACCTTA TCCAAGAAAT AGCGGTCGTG ACTGACTGTT AAAACGGGAC 5580 

CTGCAAAACC TTGCAAGAAA TTCTCTAAGA CTGTCAAACT TGCAATATCT AGGTCATTGG 5640 

TTGGCTCGTC TAAAAGAAGA ACATTTGGTT TTTCCAAAAG CAGTTTCAGG AGATAAAGAC 5700 

GTTTTTTCTC ACCCCCTGAC AATTTCTCAA TCAAACTCCC ATGCCTCGAA CCTGGGAAGA 5760

GGAATTGCTC CAGCAACTCA GCGATGGAAG TCGTAGAACC ACCACTGGTC TTGACCTCCT 5820 

CTGCCACTTC CTGCAGGTAA TTGATCACAC GCTTGCTTTC ATCCAAACCC TCAATTTGTT 5880 

GAGAGAAATA GGCGATGCGA ACAGTTTCCC CAATCACAAC TTGTCCTGCT GTCGGCTCAA 5940 

GACTTCCTGC AATCAGGTTA AGTAGGGTTG ATTTTCCAAC ACCATTGTCC CCAACAATTC 6000 

CAATACGGTC TTTAGCCTGA ACTAAGAGAT TAAAATTTTG CAAAATGCGC TTATTTTCAT 6060 

AGGCAAAGGA AACATCCTGA AACTCGATGA CTTTCTTCCC AATCCGACTG GTTTCAAAGT 6120 

TCATAGTCAA GTCTGTCTCA GCACTACTGC CTGAAACTTC CTTTTTCAGA TCATGGAAAC 6180 

GATTGATACG AGCTTGTTGC TTGGTCGCAC GCGCCTGCGG TTGTCTGCGC ATCCAGGCCA 6240 

ATTCTTGTTT CTACACTTGT TCTTTTTTGT GAAGAAGAGC CGCGTCGCGC TCATCCTGTT 6300

CCGCCTTTAG GCGAACATAG TCCTGGTAAT TTCCCTGGTA CTCGGTCAAG CCTGCACGAT 6360

CCAACTCGAA AATCCGTGTT GACAAAGCGT CTAAGAAATA ACGATCGTGA GTGATAAAAA 6420 

GGACGGTCTT CTTAGAATTT TTCAAAAAGA GGGTCAGCCA CTCAATAATC GCAATATCCA 6480

GATGGTTGGT CGGCTCATCC AAAAGCAAGA GGTCGTGGTT GCCAACTAAG ACTTGTGCCA 6540 

ACTGTACCCG TCTTCTCAGA CCACCTGACA ATTCCCCAAC AGGAGTAGAT AAGTCTTGAA 6600 

TGCCCAATTT GCTAAGAACG GTCTTGACCT GACTTTCGAT TTCCCAAGCT TGGAGAGAGT 6660 

CCATCTCTGC CATGACACGT TCCAAACGCG CCTGCTTGTC CTCACTATAG TCGAGCATAA 6720 

TCAATTCATA CTCACGAATG AGCTGGATTT CCTTGAGTTC ACTAGATAGA ACCGTATCCA 6780 

AAACTGTCTT TCTATCATCA AAATCAGGAT CCTGAGTCAA GTAACCAATC TGGTAATCAT 6840 

TTTTAGCTGA AAAAGGACTG ACATCCCCAT CAAATCCAGA AACACCAGAA AGGACGTCCA 6900 

AAAGGGTGGT CTTGCCAGTC CCATTGACAC CGATTAAACC AATTCTGTCT AAGTCATGGA 6960 

TAATAAAGGA AATATCCCTA AAAACGGTCT TGTCACCAAC GGATTTACTT AGTTTTTCAA 7020 

CGATAAAATC ACTCATTTTT TCTCCCTCAG GTAAGCATGG ATGGCTTCAC GATTATTCTC 7080 

CAATTCTCCA TCGACAATGG CAAACTCAAT CTCTGTTAAA ATCTCTCCCA AGTCTGGGCC 7140 

TGGCTGATAG CCATATTCCT TGATCAAAAT ACCGCCATTA ATCTGAATCT CTTTCTTGTC 7200 

ATGGATAGTC AACCTTTGGT ATTTTTCTGT GATGGCTTGT GGGTTGACTT CTTTTCCTTG 7260

AGCTTGACGA AGATTTTCAG CCTGTAAAAG CAAATCTATG TCAAAGCGAT AACAATCTCG 7320

CTTGCTCAAT TCTCCATTTT CACGCAGAGC CAAAATAATC AGCAAATCCT GAACTTGCTT 7380

GGCAAACTGG CGTGAGGTCT TCCAAGATTT CAAAAATGAC TGCGCATTTT CAATCTCCAA 7440 

AGCCCATAGT AAAGCCGCCC AGGCTTGTTC AGAGGATTCA AAAGTAAAAT CAGTCTCCAA 7500 

ATCAAACAGT CTGTTGAGCT TGTCCTGGCT AGATGCCATA TCAGGGAGAT AGTCATAAGC 7560 

TTGACTCTCA ATCATGGAAG CCAAGCCCCT TCTCCAAAAT GGAGCCAGCA AGAGTTTATC 7620 

AAACTCGACG AAGGTACGCT CTACAGAAAT TTTCTCCAAA AGCGGCGTCA AGGTCTTCAT 7680

AGCTTTAAAT GTTTCTGGCT CAAGTGCAAA ACCAAGACTA GCCTGAAAAC GGAAACCACG 7740 

CATAATCCGT AAAGCATCTT CGTTGAAACG CTCACTAGCC ACTCCAACTG CTCGCAAGAC 7800 

TTGCTTTTCC AAATCTTCTA AACCATGGAA CAAGTCAACG ATTTCTCCTG TCTCATCCAA 7860 

GGCAAAGGCC TTGACTGTGA AATCACGGCG TTTGAGGTCT TCTTCTAGCG ATCGTACAAA 7920 

GGAAACCGCA CTGGGTCTGC GATAGTCCAC ATAGACATCC TCTGTCCGAA AGGTTGTTAC 7980 

CTCATACTCC TCATCCCCAT CTAAGACCAA GACGGTTCCA TGCTCGATTC CGATATCGGC 8040 

TGTTCGCGGA AAAATCTGCT TGGTCTCTTC TGGATAAGAA GACGTCGCAA TATCCACATC 8100

GTGGATAGGG CTATGGAGAA GGGCATCTCG AACAGAGCCC CCAACAAAAT AAGCCTCAAA 8160

GCCTCCTTCT TTAATTTTTT CTAATACTGG TAAAGCCTTC TGAAATTCAG AAGGCATTTG 8220 

CGTTAATCTC ATAATAAGTG TTCTAATCCA TAGACAAGCT CATGACGCTT GACAACTTCT 8280

TTAATTCCCA AATTGACTCC TGTCATGAAG GAGATGCGAT CATAGGAGTC ATGACGGAGG 8340 

GTCAACCCTT CTCCCTGATT GCCAAAGATG ACTTCCTCAT GAGCTACCAA GCCTGGCAAA 8400 

CGAACTGAGT GGATGCGCAT ACCATCAAAC TCAGCACCAC GAGCACCAGC AATCAGCTCT 8460 

TCCTCATCTG CTGCACCTTG CTGAATTGAC TCTCGAACCT CTGCCATCAA CTCAGCTGTT 8520 

TTAATGGCTG TTCCACTCGG AGCATCCTTT TTCTTGTCAT GATGGAGCTC AATAATCTCC 8580

ACATTTGGGA AATATTTGGC AGCCTGCGTC GCAAATTGCA TGAGTAAGAC AGCACCCAAG 8640

GCAAAGTTAG GGGCAATCAG GCCACCCAAG TCTTGGGCAC GAGAAAATTC TTTTAGCTCT 8700 

GCAATTTCTT CACTCGTGAA ACCAGTCGTT CCAACTACTG GAGCAAAGCC ATTTTCAAGA 8760 

GCAAAACGTG TATTTTCGTA GGCAACAGCT GGAGTAGTAA AATCTACCCA GACATCCGCT 8820

TCAAAACCAG CTAAATCAGC CTTATCCTTG AAAACAGGAA TACCCTGCCA TTCTGACTCA 8880 

GACTCAAAAG GATCCAAAAC TGCCACCAAC TCCAAGTCTC GATCAGTCAA TACCATCTGA 8940 

CAAGCAGCCT GGCCCATCTT TCCCTTAAAA CCGGCAATAA TTACTCGAAT ACTCATCTCT 9000 

ACTCCTGTCT AAGATACAAA GTCCGTAAGA ACACAAACTG AAAATAGGAA TTCCAATCAA 9060 

GAAGTGTCTA CTTCTTGGAA GAACTATCTT TTTCACACAG GGTTCCAGCC CTGTTCAATT 9120 

ATCAAGATAC AAAGGACCTT AGCTGCCTCT GAAAAATAGG GAATGGCACT CACTTTCCAC 9180 

GAAAGGCAAC ACAGGCATCT TTTTTCAAGA GGCAGGTAGT CCGTGTTCAA TTTCTAAGAT 9240 

ACAAGGCATC TTAACTAGCC TAGAAGCGCC AACTAAATCA CTGGAATATA ACCCAGAGCA 9300 

ATACTTCCTG CTCCTAGGTG CGTTCCAATG ACACTACCAA ATGTAGCAAG TGAAACATCC 9360 

GAACCCAAGC CAAAATCAAG CAAGTGcTGA CGCAATTCTT CAGCCTTTTC AGGAGCATTC 9420 

CCATGAATGA CAATGACCCG GTATTGACCT GAAGCCGTTG TTTCCTTGAT AATTTCAATT 9480 

AAGCGCTTGG TGGCCTTCTT TTCAGTACGA ACTTTTTCGT AAACTTCAAT CACACCTTGA 9540 

TCGTTAAAAT AAAGGATTGG CTTAATGCTA AGCAAATTCC CCAAAATGGC AGCCCCATTT 9600 

GAAAGGCGTC CACCTTTTAC CAAATGATCC AAGTCATCTA CCATGATAAA GGCTGACGTA 9660 

CGGCTGATTT GAATGGCTAG CTTATCCTGA ATGCTGGCAA AATCATCGCC CTGATCACGC 9720 

CAATTAAAGA CGCTTTCAAC CATGATGCCT AGGGGAGCAC TTGTAATCAA AGTGTCTGGG 9780 

AAAGCAATGG TTAAGCCCTC ATAGTCATCG ACCATATACT GGATATTTTG GTAAAAACCT 9840

GAAATTCCAG AAGATAGGAA AAGCCCCAAG GCATGTGTAT AGCCTTGTTC TTTGAGCGAA 9900

GTTAAGATCT CATCTAACTT GGCAATACTT GGTTGACTGG TCTTAGGCAA TTCAGAAGCC 9960

TGAGCCATTT TTTGGTAAAA TTCCTCAGCA GACAGATTGA TGCCTTCGAC ATATTCCTCA 10020

CCATCAATAT TGACAGGAAT ATCCAAGACA AACAACTCTT CTCTTTGCAA GATCTCTGCA 10080

CTGAGATAAG CAGAGGAATC TCTGAAAACA GCTAATTTCA TATTAGAACT CCAAATTAAT 10140

TCCTGGTAAG TCTAATGCAA TTTCAGTCAC TTCGTAAGTC AAACGATTGA GCATGTTCAA 10200

ACATGGACGA GCCAAGGTTT CCACCTCTTC TTGGTTCAAT TCACTTGGTT CATTGACAAT 10260

ACGGCCATCG ATATGGTTTA CTTGTGAGAT TGTTCCACTA ATGACAAACT TATCAAATAC 10320

AATCATAAAG CTCAAGATGA CAATCAAGGA AGTCACTTGA TTTTCTTGGT CATGTTGGAG 10380

CAATTGGAAA TTCACATCCA CCTTGGTTTC AGGAGCTCCA TTTTCATTTT CCCATTCAAA 10440

ATTACGCGCA TCAAAATGAT ACTGACTAAC AAATTCTTCT TCACGTTTAA GATTCATGTC 10500

TTTCTCCATC GGCTACAATA TTATAAGCTA TTGTACCATA ATTTTTTATT TTCATCTAGT 10560

TTTCTAGGAT TTAGTCAATC CCAATTTCAG CACGAACTAC ATCTGTGATG GTATCAACAT 10620

AGTAGTTTAC TTCTTCTGTT GTAGGCGCTT CTGCCATAAC ACGCAAGAGG GGTTCTGTTC 10680

CACTTGGACG AACAAGGATA CGCCCGTTCC CCGCCATTTC TTCTTCCATC TTCTCGATCA 10740

TGGCCTTGAT AGCTGGCACT TCCATGGCCT TTTCCTTCAT GACGTTTTCC ACTCGGATAT 10800

TAACTAATTT TTGTGGATAA ATCGTTACTT CTGCCGCCAA CTCTGATAAG CTCTTACCAG 10860

TTTCCTTCAT GATTTTAGTC AATTGAACTG CTGATAATTG ACCATCACCT GTGGTATTGT 10920

AATCCATCAA GATAACGTGA CCAGACTGTT CACCACCAAG GTTGTAGCCT GATTTTCTCA 10980

TTTCTTCAAC AACGTAGCGG TCACCAACTG CAGTAACTGC CTTGTTAATA CCTTCGCGAT 11040

TCAAGGCCTT GTGGAAACCA AGGTTAGACA TAACAGTTGT CACAATTGTA TTTTGAGCCA 11100

ATTGTCCTTT TTCAGAAAGG TATTTTCCGA TGATGTACAT AATCTTGTCA CCATCAACCA 11160

TGTCACCATT CTCATCAACA GCAATCAAGC GGTCACTGTC TCCATCAAAG GCCAAACCAA 11220

TAGCTGACCC ACTTTCTTTG ACCACTTCTT GAAGGGCTTC TGGATGTGTT GAACCAACAT 11280

TAAGGTTGAT GTTAAGACCG TCTGGTGTTT CCCCGATAAC CGTCAATTGG GCACCAACCT 11340

CTGCAAAGAT TTGACGGGCA CTGGTAGAAG CTGCTCCATT AGCTGTATCC AACGCAACCT 11400

TCATTCCATC AAGAGGAGTT CCAGTTGAAA CAAGGTATCC TTCATACTTA CGCArGCtTC 11460

TGGATAATCT ACCAAAATTC CTAAGCCTTC TGCACTTGGA CGAGGAAGAG TGTCTTCCTC 11520

AGCATCTAGC AAGGCTTCAA TTTCTGCTTC TTTTTCATCA TCTAGTTTGA AGCCATCACC 11580

GCCAAAGAAC TTGATTCCGT TATCAAGGGC TGGGTTGTGG CTAGCAGAAA TCATGACACC 11640

GGCACTTGCT CCTTCAGTTT CAACCAAGTA AGCTACTGCT GGTGTTGCAA GGACACCAAG 11700

TTTGTATACG TGAATCCCTA CTGAAAgGAG ACCTGCCACC AAGGCCGATT CCAACATTTC 11760 

CCCTGAAATA CGTGTGTCAC GTCCTACAAA GACTTTCGGC GCTTCCGTTT CATGTTGACT 11820 

AAGAACATAG CCTCCAAAAC GTCCTAGTTT AAAGGCTAAT TCTGGTGTTA GTTCTAGGTT 11880 

AGCTTCTCCA CGGACTCCAT CAGTCCCAAA ATATTTACCC ATTGTTATAA AATCCTTTTC 11940 

TATTTTTAAT TCGTTTTTGA ACTAGTTGCT TTCGTTGACG AAGATGTCTC CGATGAACTG 12000 

CTTGTACTTG AATTTGATGT GCTTGAACTT GGTGCTACTG GTTTTGTAGT CACCTTCATT 12060 

ATTGTATCAA ACGGAGTGAT AACTGCCGGT AAGACAACAC CATTGCGCTC GATTGCCTGC 12120

AAAGGTACTG AACCACTGTA ATTACCTGTT ATACGTTCGC TAGTTGGCAA AACAGCGATA 12180 

ATCTTATCAA TTCTATCCAA TGTCTCTTGG TCACTCGTAA TAGACACTTC TTTATCTGAC 12240 

ACCATGACAT TTTCAATTTG TACCCGACTA TCAATTTGAC TAGGGTCAAT CTCTGGTACA 12300 

ATCTTTACCT TATCCTTCTG AGCCTTCTTA CCAATCTTGA CTGTAATTTT TTGCGGAGTC 12360 

GCCACAGCGG TCAGCCCATT GGGTAAATCT TCAATCCTCA AAGGAACTTC AATCGTTCCA 12420

ACACCGGCAT CTGTTAGGTC AGCAGTAACC TTGAATTTAC GTGTACTTTC TTGCATTTCA 12480 

CTAGCTAGCG ATAGGCGATT TGCACCAGTC AAGACCACTC ATACTTCTGA AGCAAAACCG 12540 

CTAATAAAAT ACTTATCACT ATTATAGCGT ATGTCAATAG GGACATTTGT TACTGTATTA 12600 

GTATACCTTT CCGTTTTTAC CTGCCTAGCA CTGGTACTGT TTTGAAAATT CGTCGCCCTA 12660 

GCATAGACAA ATAAGACACA AGCAAAAAAG AGTGAGGATA TGATATATAA ACTATTTTTT 12720 

TTCATGTTTC CATCCTCCTA GCAATCGTTC TTTAAAACTA AGACCCACTT CCTCTTTTGG 12780 

AAGTAAGATT TCACGTAATT CTGTTTCAAA TTCATCAAGT GTTAGGTTGT GCTTAAACCT 12840

TCCATTATAG GTTATCGAAA TTCCTCCCGT TTCCTCTGAT ACGACAAAAG TCAAGGCATC 12900 

TGAGACTTCT GATAAACCGA TAGCCGCCCG CTCTCTGGTC CCAAATTCCT TGGAAATCCC 12960 

TGTGTTTTTT GTCAAGGGCA GATAGGCAGA CGTCACAGCG ATACGTTCTT CTTTGATAAT 13020 

CACCGCACCA TCATGTAGGG GAGTGTTGGG AATAAAAATG TTAATGAGAA GTTCTGCACA 13080 

AATCTTAGCA TCCAAGGGAA TTCCTGTCGA AATATACTCC TGCAAGGTAC GTACACGCTG 13140 

AATAGCAACC AAGGCCCCGA TTTTACGAGG ACTCATGTAT TCAACAGACT TAACAAAGGC 13200 

ACGAATCATC TGTTCCTCAG CACTAATAGG GGCATTGGAA AAGAAATCTC TCGCTCTTCC 13260 

CAAACGTTCC AAACCAGTCC GAATCTCTGG AGAGAAGATA ACAACCGCCG CAATAACCCC 13320 

ATAAGTAATA ATTTGATTGA TTAACCAAGA AATCGTAGTC AAACCAATCA TATTTGCAAG 13380

GATTTGAGCT AAAATAAACA CCAAAACTCC ACGTACCAAA ATCATAATCT TGGTTCCTGC 13440

AATAGCTTTT GTAAAATGGT ATAAAATATA AGCAACAATC AAAATATCAA TCAGATTGAT 13500

AGCTATCGTC CATGGACTTG CAAACAAACT GGTCCAATAT TGCAGATTGG ATAATTGTTG 13560

AAAATTCATC CCTGATATCC TCCCTATCAA AACACTTTCG TCCTATTATA CCATTTTCTG 13620

GCATTTTTTT CCCTATCCTA GTCCATTTTA CATTGAACAA AAATATGATA AAATAAACTG 13680

ACTAAAAAAA ACAAAGGAGA AACTATGTCT CAACTCTATG ATATTACCAT TGTGGGTGGT 13740

GGTCCTGTCG GGCTTTTTGC AGCCTTTTAT GCCCACCTAC GCCAAGCCAA GGTTCAAATC 13800

ATCGACTCTC TTCCCCAGCT AGGTGGACAA CCTGCTATTC TCTACCCTGA AAAGGAAATC 13860

CTAGACGTAC CAGGCTTCCC AAACCTGACT GGAGAAGAGT TGACTAACCG CTTGATTGAA 13920

CAGCTAAATG GATTTGATAC CCCTATTCAT CTCAATGAAA CGGTTCTTGA GATTGACAAA 13980

CAAGAAGAAT TTGCCATCAC AACTTCTAAA GGAAGTCACC TGACTAAAAC AGTTATCATC 14040

GCTATGGGTG GCGGTGCCTT CAAACCACGT CCGCTGGAAC TTGAAGGGGT TGAGGGCTAT 14100

GAAAATATCC ACTACCACCT TTCTAACATT CAGCAATACG CTGGTAAGAA AGTGACGATT 14160

CTTCGTGGGG GAGACTCGGC TGTGGATTGG GCTTTGGCTT TTGAAAAAAT CGCACCAACT 14220

ACCCTTGTTC ACCGCAGAGA TAATTTCCGT GCCTTGGAAC ACAGTGTTCA AGCCTTGCAA 14280

GAATCATCTG TAACCATCAA GACACCATTC GCCCCTAGCC AACTCCTTGG AAATGGAAAA 14340

ACACTTGATA AACTTGAAAT CACAAAAGTC AAATCTGATG AAACTGAAAC CATTGACCTA 14400

GACCACCTCT TTGTCAACTA TGGTTTCAAA TCTTCTGTCG GTAACCTTAA AAACTGGGGG 14460

CTCGACCTCA ACCGTCACAA GATTATCGTC AACAGCAAAC AGGAATCCAG CCAAGCAGGT 14520

ATCTATGCTA TCGGTGACTG CTGCTACTAT GACGGAAAAA TTGATCTGAT TGCGACAGGC 14580

CTCGGAGAAG CTCCAACTGC TCTCAACAAC GCTATCAACT ACATTGACCC TGAACAAAAA 14640

GTACAACCAA AACACTCTAC TAGTTTATAA AAAAGAACCA CGAGTCACAT AGGATTCGTG 14700

GTTTTATAAT TCATCCCCTA TCTTATTGAT TTTTCTGAGT CTGTGATTGA CACCACTTTT 14760

GGTCAGAGGG GTGCTGAGAC TATCTGCTAA CTGCTGGATA GAGTAGTCTG GGTGCTGAAT 14820

CCTCAGTTGC GCCACTTCCT GCAAATCTAC TGGCAAATTT TCTAAGCCCA TGATATCTTT 14880

GATTTTACTG ATATTGTTAA TGGTCTTCAT GCTGGCAGAA ACTGTCCGAG CGATATTAGC 14940

TGTCTCGGCA TTATTAGCCC GATTGAGGTC GTTACGGGTT TCTCGCAAAA TCTTAACCCG 15000

CTCAAAATCA TCACGTGCCT GCATGGCTCC TATTACTATC AAGAACTCCA TAATGTCTTC 15060

TGCTCGCTGG AGATAGGTCA CAGCCCCCTT CTTGCGCTCA AGCACCTTGG CATCCAGTAA 15120

AAACTGTTGG AGAAGGGAGG CAATTCCTTG CGCGTGGTCC AGATAAACAG AACTGATTTC 15180

CAACTGGTAC TTGCCTGACT CAGGGTCACG AATGCTCCCA TTTGCCAAGA AAGCGCCACA 15240

GAGATAGGCA CGACCTGCTT CCTCATCCGA TAAAATCGCC TCATCAATAC CTGTTTCCAG 15300 

GCCAAAGAAA GAGTCTGCCA AGTGCAAATC ACTTAACAAA TCCTGCACCT TTTCATCTGT 15360 

AAAAACGGTA TAGACGCGAT TCTTGCGAAG ATTGCTCCGT TGGTGGTGAC GAATTTCAGA 15420 

TTTGATTTCA TAGAGATGGA GAAAGGACTC ATAGAGGTGA CGGGCCAGTT TGGCATTTTC 15480 

TGTCACAACT GACAAAGTCA AGCCCGAAGT CGAGAGACCG ATCCTACCAG ACATTTTGAT 15540 

AATGGCAGAT AATTCATGCC AGCTCAGATG GTGTTGGCCC AGGATTTCTT CTTTTACTGC 15600 

TACTGTGAAA CTCATTTTTT CACCTGTATA ATGCGCATCA ACTCGTCCAC AATCAAATCT 15660 

CCATCGTGGA AGGCACCGCC ATTTTCCAGA CGAAGGAAGT TAGATGAAAT CACGCGCGAA 15720 

ACTTGCTTAC AAAGACCTAC AAAATCGTGT TCCACTTGCA CTAAGTATTC ATCAAAACGG 15780 

TTGGAATTCA TGTATTCCTG AGGCACTTTT TCAATATTCA CCAAGACAGT GTCGATAAAA 15840 

GGGCGACCAA GGTGACGATG CAAGACTTCC ACGTGGTCGC TATCTGTAAA GTGTTCCGTC 15900 

TCCCCACGTT GGGTCATGAT ATTGCAGACA TAGGCAATTT CTGCCTTGGT TTCCAAAAGA 15960 

GCCCGCCCAA TTTCCTTAAT CACCATATTG GGCAAAATAG AGGTAAAGAG GGAACCTGGC 16020 

CCTAGGACAA TCATGTCACT TTCAAGGATG GTCTGCACTA CTCGACGGCT GGCCAGAGGC 16080 

GTATCATCGT TTAGGGCATT GGTCACATAG ACATTGTCAA TTATGCCTCG ATGGTCTACA 16140 

ATATGACTCT CTCCAGCCAC TTCTGTCCCA TCCTGAAAGA CTGCATGAAG GGTCAAAGGA 16200 

TGGTCACTGG AAGGATAAAT TTTCCCTGTT GTATGCAAAA ATTTGCTCAA TAACTGCATG 16260 

GCATTATAGG TTGAACCCTG CATTTCTGAC AAGCCAGCAA TGATGAGATT TCCCAATGGA 16320 

TGGCCAGCAA AGGCTCCGGC ATCCTCAGAG AACCGATACT GAAAGACCTT CTCATAAAAC 16380 

TTAGGCATAT CCGACATGGC CACAAGGACA TTACCAAGAT CACCTGGCGG TGTCAACTGT 16440 

TGCATATTTT TTCGGAGTTC ACCTGAAGAA CCACCATCAT CTGCCACCGT CACGATAGCT 16500 

GCGATTTCCA CATCTTTTTC CCGCAGACTT TTTAGAATGA CGGGACTTCC AGTCCCTCCA 16560 

CCAATCACCG TTATCTTTGG TTTTCTCATG AACGGTTTAC CGTTTCCTTT CTGCGGTCTT 16620 

TGTCGCGATG CCCTTCATTA ACAGACCAAT TCTTGGATAA GTCCTGCGCC AAGCGTTTAG 16680 

CAAATGCCAC ACTACGGTGT TGTCCACCCG TACATCCCAT GGCAATGGTC AAAACGGACT 16740

TACCTTCCTT TTGGTAACTT GGCAGAATCG GCTCAATCAA GGCCAATAAA TGTTGATAAA 16800

AGTCTTCTGA CTCAGGATGG TTCATGACAT AATCATAAAC AGGTTCATCC ACACCCGTTT 16860

CGTTTCTCAG TTCTGGTAAA TAATAGGGAT TTGGCAAGAA ACGGACATCA AAGACCAAGT 16920

CCGCATCAAT CGGGATTCCA TACTTAAATC CGAAAGACAT GACTTCGATA CGGAAAGACT 16980

GGGCTTGTTC TTGGTCTGAA AACTGCTCTG CAAGGGTTTT GCGCAgCTCA CGTGGAGTGA 17040 

GTTCAGTCGT ATCCACCACA TTTTGGCTCA TATTTTTCAA AGGTGCCAAG AGTTCACGTT 17100 

CCAACTTGAT TCCATCTAAA ATACGACCGT CTGCTGCTAG TGGGTGACTC CGTCTGGTTT 17160 

CCTTGTAACG AGCGACCAAT TCCTTATCAG CCGCATCCAA AAAGAGGATT TTGAAATCCA 17220 

AACCATCTTG ATTTTCCAAC TCATCCAAAA CAGCTTGAAT CTCTGAAAAG AAAGAACGGC 17280 

TACGCATATC CACTACCAAG GCCAACTTAG GATTGTCTTC CTTAATTTCA ACCAGCTGCA 17340 

AAAACTTAGG CAAGAGAGCT GGCGGCATAT TATCAATGGT GAAATAACCT AGATCCTCGA 17400 

AGGACTGAAT GGCTACAGTT TTCCCTGCGC CACTCATCCC TGTCACAATC ACCAAGTGAA 17460 

GTTGTTTCTT TGTCATCTTT TTCTCCTTA? ATCAAAAGAA GTTTGGCAAC ACCAAACTTC 17520 

AACTAGCTTA TCCAATCTCT GCGATGACTT CAATTTCGAC TTTTACATCA CGAGGAAGAC 17580 

GAGCTACCTC CACAGCTGAA CGAGCTGGGA ATTCCTCTTT GAAGGCCGTT TGGTAAACCT 17640 

CATTAAAAGG AACAAAGTCG TTCATATCGC TCAAGAAGCA AGTTGTTTTG ACAACATGGT 17700 

CAAAGTCTGT TCCTGCTTCT GCCAAAATAG CACCGATGTT TTTCAACACT TGCTCTGTCT 17760

GTTCTTGGAT ATTCTCTCCT ACAATTTCCC CAGTTTCAGG GGATACGGGA ACTTGACCGC 17820 

TAGCAAACAA AAGGTTGCCA ACGATTTTTC CTTGAACATA GGGTCCGATA GCCTTTGGGG 17880 

CCTTATCTGT ATGAATTCTT TTTGCCATTT TCTTTTCCTC ACAATTTTTC TAAGATTGCA 17940 

TCCCAAGCCT CATCCATCCC TGCCTTACTG ACAGATGAAA AGAGGATGAA ATCGTCACTC 18000 

GGCTCAAAGT TTAATTTCTT TTTGATTCCT GATTCATGCT TGTTCCATTT ACCACGACGA 18060 

ATCTTGTCCG CCTTGGTCGC CACAATGATG ACTGGAATCT CATAATACTT GAGAAATTCG 18120 

TACATCTGCA CATCATCTGC TGACGGGTCA TGACGAAGGT CAACTAGACT GACAACCGCA 18180 

CGGAGATTTT CCCGAGTCGT TAAGTACTCC TCAATCATGC ACCCCCACTT TTCACGTTCC 18240 

TTTTTAGAAA CACGAGCATA GCCATAACCA GGCACATCCA CAAAGCGCAT CTTGTCATCA 18300 

ATGTTAAAAA AGTTCAGGAG CTGGGTTTTA CCAGGTTTTC CTGATGTACG GGCGAGATTC 18360 

TTACGGTTCA ACATAGTGTT GATAAAGCTC GATTTACCAA CATTTGAACG CCCTCCTAGG 18420 

GCAATCTCTG GCAGTTCATC CTGCGGATAG TGGGACTTAT TAGCTGCACT GAGCAAGATT 18480 

TCAGCATTGT GTGTATTAAG TTCCATAGTC ACCTCTAGGC TGTTTCTAGG ATCGGTTTAT 18540 

CCGTTCCATC TACAGTTTCT TTAGTGATGC GAACCAATTT CACATTTTCC TGACTCGGCA 18600 

CCTCAAACAT GACATCTAGC ATGGTTTCTT CGATGATGGA GCGAAGTCCA CGCGCCCCTG 18660 

TCTTCCGTTC GATTGCTTTA TTAGCAATCT CTTGAAGGGC TTCGTCGTCA AATTCCAACT 18720

CAACATCATC ATAAGAAAGC AAGGTTTGGT ATTGTTTCAC CAAGGCATTT CTTGGCTCTT 18780

TCAAGATGCG AACCAAGTCA TCAACGGTCA ATTGCTCAAG AGCCGCAAAA ACACGCAAGC 18840

GTCCAATCAA CTCAGGGATA ATACCAAATT TTTGAATGTC TTCAGCGATG ATTTCTTGCA 18900

TGTATGAGCT GTTTTCGTCA ATCGCCTTAT TATTTTGACC AAATCCGATG ACTTTTTCAC 18960

CCAGACGTTG TTTGACAATT TCTTCAATAC CATCAAAAGC ACCACCCACG ATGAAGAGGA 19020

TATTTTTTGT ATCCACTTGA ATCATCTCTT GTTGTGGATG TTTGCGTCCA CCTTGAGGCG 19080

GTACGCTAGC AACAGTTCCC TCAATAATCT TGAGAAGGCC TTGTTGCACC CCTTCACCAG 19140

AAACATCACG TGTGATAGAC ACATTCTCAC TCTTCTTGGC AATCTTGTCA ATTTCATCCA 19200

CATAGATAAT GCCACGCTCT GCACGTTCGA TGTTAAAGTC AGCAACCTGC AAGAGTTTGA 19260

GGAGGATATT TTCCACATCC TCACCCACAT AACCAGCCTC CGTCAGAGCT GTCGCATCCG 19320

CAATAGCAAA AGGTACATTC AAGCTCTTAG CCAAGGTCTG GGCAAGGAAA GTTTTCCCTG 19380

AACCAGTTGG GCCAATCATC AAAATGTTTG ACTTCTGCAA ATCCACATCT TCTGACTCTT 19440

CGCGTGTATC GTGGAAATTG ATGCGTTTGT AGTGGTTATA AACCGCCACT GCCAAGGCAC 19500

GCTTGGCACG ATCTTGACCA ATTACATAGT GGTTCAAGAT ATGGAGGACT TCAATTGGTT 19560

TTGGCACCTC AGACAAGTCT GCCAAGACTT CCTCAACCAA TTCTTCTCGA ATGATTTCCrr 19620

GAGCTAACTC CACGCATTCA TTACAAATAA AAGCATTGTT GCCAGCAATT ATTTTTTGTA 19680

CTTCTTCTTG GTTTTTGCCA CAAAATGAGC AATAAACCAT CATATCATTT TTTCTATTTG 19740

TAGACATCAT TTCCTTCCA? TCTATACTGT CATTCTATCT AAAATAAGGT CATGTAAAAA 19800

CCATGAATAC TATTGACCAG ATTCCTAAAG GCATTTAACC AAAGGAGGAT AGAAAGCCCG 19860

TAACGCTTTT TACGAAAAGC TTGTGCTCCT GCCAGAAAGC AGATGAAACA CAGAAAAGCC 19920

GTGAATAGAC CAAATAAACT CCGTTCCATT AGACTTCCTT TCTCTTGCCG TATTGGATCG 19980

TAAAATCATA AGGATTCTTC TCATCTTTGG CGTAAAATTT GCTTGAAACT GTCTCAAAAA 20040

GAGACAAGTC AAGTTCTTCA GGGAAATAGG TATCTCCTTC CACCCGAGCA TGAATGTGAG 20100

TGACAATCAC TTCATCAAGG TAAGGTTCAA AAGCCTGAAA AATTTGCTTC CCACCGATAA 20160

TGTAGAGATT CTTTTCTTGA GCCTGATACC AGTCAAGAAC AGACTGGACG TCCTGAAAAG 20220

TAGCAACCCC ATCTATCTTT TCTTCCGGAT TACGCGTCAA AATCAAGGTT TCCCGTTTTG 20280

GAAGCAAGCG ACGCCCCATC CCATCAAAGG TCACACGCCC CATCAAGATA GCATGATTCA 20340

GAGTTGTTTC TTTAAAGTGC TGCAATTCTG CTGGCAAATG CCAACGCAGA CGATTTTCCT 20400

TACCAATCAC ACCCTCTTCA TCCTGGGCCC AAATAGCTAC GATTTTCTTA GTCATGCTTC 20460

CATCCTTTTC ACTGATAGTA CTATTTTATC AAAAAACTCA AAAAAAGACT GGTTTGGAAT 20520

AGCTTACAAA ATAGAAAAAA TCTGTAAGAA ATTTCCTACA GATTTATCTA TCTTTCCTTA 20580 

TTTCTTACAA ACCAGGTGCT TGTCCAAGTT CGGCTGCAAG CATCCAAATT GTTTTATCTG 20640 

TTTCAGTTTT AGCGCCTGCA AAGATACCGT TTGTCACATC GTCACCTTCT TCATCAGTGA 20700 

CATCCAAACC TTTTTGGAAA AGTTCTGACA AGTAACGGTA GATAACAAGA ACACGTTCCA 20760 

AGCTTTCTTC AACATTACGC TATTCACCAG CTTCTTCTTC GATTTCACTA TTTTGAAGGA 20820 

ACTCTGTCAA TGTAGAGAAT GGGCTTCCAC CGAGTGTAAT CAAGCGTTCA CTGATTTCAT 20880 

CCAATTGACC GTCAAGAGCT TCCATGTACT CATCCATTTT TGGATGCCAT ACAAGGAAAC 20940 

CACGACCATG CATATACCAG TGCACTTGGT GCAAAGCAAC GTGAGCTACA TACAAATCAG 21000 

CAACAGCTTG GTTCAAGACT TCCTTTGTTT TTGCCAATGC 21040 
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2387 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

ATTCTTAATA CGATTAAAAG GCTTATTACT AAAAGAAAAT TTCAGTTACA TGAACTAAAC 60 

TTGCTCGTCA AATCCCGATT TAACGAGATG TTTGGGGAAA ATAAAATATT TGAAAGCATT 120 

GATAACTTAT TTGATATTAT AGATGGTGAT AGGGGCAAAA ATTATCCTA* ATCAGATGAG 180 

TTGTTTAGTG AGGAGTACTG TTTATTTTTA AATACAAAGA ATGTTACTAA AAACGGATTT 240 

TCATTCGATA CAAAGCAATT TATCACTAAA ACAAAGGATA AATTACTTCG AAAAGGCAAA 300 

CTTGAGCGTT ATGATATAGT CTTGACAACA AGAGGTACTG TTGGAAATGT AGCGTACTAC 360 

GATGAATTAA TAAAATATAA ACATTTACCT ATAAATTCAG GTATGGTAAT ATTACGTCCC 420

AAGACACCAA ATCTAAATCA GAAATTTATT ATCCATGTTT TAAGGAATAA TAATTATAGT 480 

CGAGTGATAT CAGGAAGTGC TCAGCCTCAG TTACCAATTA CAAAATTAAA AAAAATACTT 540 

CTCCCCCTCC CCCCACTAGC CCTCCAAAAT GAGTTCGCAG ACTTTGTAGT CCAGGTCGAC 600 

AAATCACAAT TGGCAATCCA AAAATCTCTG GAAGAACTTG AAACTTTGAA GAAATCTCTG 660 

ATGCAGGAGT ATTTTGGCTG ATATTCTGCC ATTGTAATTA CGGTAATGAT TTGTTATAAT 720 

ACTTCAAAGG AGGAAATCAG ATGGTAGTAA AAACAAGAAA ACAAGGAAAT TCAATCACCA 780 

TTACGATTCC AAGTGAATTT AATATTCCAA GTGGTGTTAA ATACGAAGCG AAATTGTTAC 840

CAAGTGGTGA GATTATCTTT ACTCCTGAAG AATTGGGGCA GCAGGTTTCT TATGTATCTG 900

ATGATGCCTT TGACTTAAAT TTAGATAAAA TATTTCACGA ATACGACGAT GTTTTCAAAG 960

CTTTGGTGGA AAAATGACAA TCTATTTGAC AGAAAAGCAA ATTGAAAAAA TAAATGCTTT 1020

AGCAATTCAA CGGTATTCTC CAAATGAGAA AATTCAAACA GTTAGTCCTT CTGCCTTAAA 1080

TATGATTGTG AACTTACCAG AACAATTTGT CTTTGGGAAG CCTCTTTATC CAACAATTTT 1140

TGATAAAGCA ACGATACTAT TTGTCCAATT GATAAAGAAG CATGTTTTTG CTAATGCTAA 1200

TAAAAGAACT GCTTTCTTCG TTTTGGTCAA ATTTTTACAA TTAAACGGCT ATCGTTTTTC 1260

TGTAACGCTA GAAGAAGCAG TAAAAATCTG TGTAACCATC GCAGTAGAAG CTTTAACTGA 1320

TGAAAAAATG ACAAGCTACT CCAAATGGAT TTCTGAACAT TCTGTTAGAG AAAAGGTCAA 1380

AAAGTAACCT AGTATGCTGG ATTTGAATGA GCACAAGAAA ATAAATGAAC AGACAATATT 1440

AGAATTCTGT AATGCAGAAA CTGATATTGT CTCTTTTTAT TGATGAATAA GAAAGTGAGA 1500

AATTATGGAA TCAAAAGTTA CAATTATCAT GCAAGAAATG TTACCTCTTT TAAATAATGA 1560

ACAATTACTA GCGTTGAGAG ACAGTTTACA ACATCATCTA GTAGACGGAA AAAAGCAGCA 1620

GAAGTATTCG AATAATAACC TGTTGCAACT ATTTATTACC GCCAAGCAGG TAGAGGGCTG 1680

TAGCTCAAAA ACAATTCGTT ATTATCAGAG GACGATTGAA AACTTGTTTA ATCCTATTAA 1740

AGAGTCTGTG ACACAACTCA CAACAGATGA TTTAAGGAGT TATTTAGCAA ATTACCAGTC 1800

TGAAAAGGAT TGTAGTAAGG CAAATTTAGA CAATATTAGG CGTATATTGT CTTCTTTTTT 1860

TGCTTGGCTT GAGCAAGAGG ATATATCATT AAAATTCCCA TTCGACGGAT ACAGAAAATT 1920

AAGACTGAGC AAAATGTGAA GGAAACTTAT ACTCATGAAC ATTTGGAAAT TATGCGTGAT 1980

AACTGTGAAA ATTTGAGAGA TTTGGCAATA ATAGACCTAC TAGCATCGAC AGGTATGCGT 2040

GTAGGCGAGC TTGTACAGTT GAATCGTTCA GATATTGATT TTGAAAACAG AGAGTGTGTT 2100

GTCTTTGGTA AAGGAAAGAA GGAGAGACCA GTATATTTTG ACGCTCGTAC GAAAATTCAT 2160

TTAAGAAATT ATCTTAACGA CAGAAAAGAT AGTCACCCTG CTCTTTTTGT AACGCTAGTT 2220

GGAAAAGTCC AGAGGCTTGG AATTGCTGGT GTAGAGATTC GCTTAAGAAA GTTAGGAGAC 2280

AAACTCGGCA TACAAAAGGT TCACCCACAT AAGTTCAGAA GAACTTTAGC GACTAAGGCA 2340

ATTGATAAAG GTATGCCTAT CGAACAAGTC CAAAAACTGC TAGGTCA 2387


(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ATATTAAAGC GACTTTCTGT GCGCTAGGGA AAAATGTTCC TGGGAATGAG GACTTGGTGA 60 

AGAGGATAAA ATCTGAAGGT CATGTTGTTG GAAACCATAG CTGGAGCCAT CCGATTCTCT 120 

CGCAACTCTC TCTTGATGAA GCTAAAAAGC AGATTACTGA TACTGAGGAT GTGCTAACTA 180

AAGTGCTGGG TTCTAGTTCT AAACTCATGC GTCCACCTTA TGGTGCTATT ACAGATGATA 240

TTCGCAATAG CTTGGATTTG ACCTTTATCA TGTGGGATGT GGATAGTCTG GACTGGAAGA 300 

GTAAAAATGA AGCATCTATT TTGACAGAAA TTCAGTATCA AGTAGCTAAT GGCTCTATCG 360 

TTTTGATGCA TGATATTCAC AGTCCGACAG TCAATGCCTT GCCAAGGGTC ATTGAGTATT 420 

TGAAAAATCA AGGTTATACC TTTGTGACCA TACCAGAGAT GCTCAATACT CGCCTAAAAG 480 

CTCATGAGCT GTACTATAGT CGTCATGAAT AAGCAAGAAA AAATAGGTCT GTTAGATATT 540 

TGACAGACTT ATTTTTTACA GAATATAGTA CTACTTAAAA AATGTTTTAT GCTATAATTG 600 

ATGAATAAAA TAGAAGGAGA AGCATATGAA TACCTATCAA TTAAATAATG GAGTAGAAAT 660 

TCCAGTATTG GGATTTGGAA CTTTTAAGCC TAAGGATGGA GAAGAAGCCT ATCGTGCAGT 720 

GTTAGAAGCC TTGAAGGCTG GTTATCGTCA TATTGATACG GCGGCGATTT ATCAGAATGA 780 

AGAAAGTGTT GGTCAAGCAA TCAAAGATAG CGGAGTTCCA CGTGAAGAAA TGTTCCTAAC 840 

TACCAAGCTT TGGAATAGTC AGCAAACCTA TGAGCAAACT CGTCAAGCTT TGGAAAAATC 900 

TATAGAAAAA CTGGGCTTGG ATTATTTGGA TTTGTATTTG ATTCATTGGC CGAACCCAAA 960 

ACCGCTCAGA GAAAATGACG CATGGAAAAC TCGCAATGCG GAAGTTTGGA GAGCGATGGA 1020 

AGACCTCTAT CAAGAAGGGA AAATCCCTGC TATCGGCGTT AGCAATTTTC TTCCCCATCA 1080 

TTTGGATGCC TTGCTTGAAA CTGCAACTAT CGTTCCTGCG GTCAATCAAG TTCGCTTGGC 1140 

GCCAGGTGTG TATCAAGATC AAGTCGTAGC TTACTGTCGT GAAAAGGGAA TTTTATTGGA 1200 

ACCTTGGGCG CCTTTTGGAC AAGGAGAACT GTTTGATAGC AAGCAACTCC AAGAAATACC 1260 

AGCAAATCAC GGAAAATCGG TTGCTCAGAT AGCCTTGGCC TGGAGCTTGG CAGAAGGATT 1320 

TTTACCACTT CCAAAATCTG TCACAACCTC TCGTATTCAA GCTAATCTTG ATTGCTTTCG 1380 

AATTGAACTG AGTCATGAGG AGAGAGAAAC CTTAAAAACG ATTGCTGTTC AATCGGGTGC 1440 

TCCACGAGTT GATGATGTGG ATTTCTAGAA AATCATAAAA AGAATTGTAC ATTATTCTAA 1500 

TTTTTGATAT AATAGTCAGC AGGAAAGAAA GTCTTATGGC GTTCTTCAAG CGAGCTTGGG 1560 

ATAGTGGGAG CCAAGTAGGG CAAAATAAAG GCCTGGCGCT TTCTGTAGTA TTTTCAAAAA 1620

CAATGAAGTA ATAAATTAGG GTGGAACCGC GTTTCTGACG CCCCTAGGTT AAATCAACCT 1680

AGGATTGTCA GATGTGGTTC TTTTGCTTAT TCAGTCTATT GTGTGAAAGA AAGGAGAGCC 1740 

GTGGACAACC TTTATCTTGT AAAAGACGAT AGTCAACTAG CTACATTTCG TGATTTTGTA 1800 

GTAAGAAATA CTGAAAAGTT GAAAGATTAT CAATCTTTTT TAAAGAATGA ACTTGCAGTC 1860

TGTGATTTAC CGCAAGCTGT TATTTGGTCA GATTTTAATG CTGCTACACA GATTATTAGG 1920

GAAAGTGCTG TTCCAACCTA TACAAATAAT AGACGAGTCG TTATGACGCC TGATTTAGCT 1980 

GTTTGGAAAG AATTGTATTT GTATCAGTTG ATGGACTACG AGTGTTCTGA GCAAACTCAA 2040 

GCAATAGAAA GTCACTATCA TTCTTTATCT GAAAATTTCC TCTTACAGAT TGTAGGACAT 2100 

GAGTTAGCTC ATTGGTCGGA CATTTTTTAG ATGATTTTGA TGCTTATGAC TCTTATATCT 2160 

GGTTCGAAGA GGGGATGGTT GAATATATTA GTCGCAAGTA TTTCTTGACA GAAGAGGAAT 2220 

TTCAAGCCGA AAAAATTTGT AATCAATCTC TCGTAGAACT TTTTCAGAAG AAGTATAGTT 2280 

GGCATTCATT GAATGATTTT GGTTCTTCGA CTTATGATAA GAACTATCCA AGTATTTTTT 2340 

ATGAATACTG GCCCAGCTTT TTGACAGTAG ATAAGTTGGT AGAAAATTTA GGTAGTGTAC 2400 

AACCGCTCTT AGATTCTTAT CATTTATGGG CAAATACAGA AAAAACTTTT CCCTTGTTAG 2460 

ATTGGTTTGT TCAGCAGAAA TTAATTGAAA AAGAAATATA AAAACTAAAG GAGTAAACAA 2520 

TGTCTAAGAA ATTAACATTT CACTGCATCA GTGGCAGAGA CCTCCTTACA GTCGGGCTGC 2580 

TCCACGCTCA GCACTAGAGT GCCTGAGCTA GACGCAGTAC TAACTCGTCT TGCCTCGTAT 2640 

GATCGACGAG GCAGACTCGT GTCGCAAGTA ATTATTTTTT ATTAAGGAGT ATTCAATGTC 2700 

TAACAAATTA ACATTTCACT GCGTCAGTGG CAGAAACCTC CTTACAGTCG GACTCCCCTA 2760 

CGCTCAGCAC TACAGTGCCT GAGCTAGACG CAGTACTAAC TCCTCTTGCC TCGTATAATC 2820

GACGAGGCAG ACTCGTGTCG CAAGAAATTA TTTTTTATTA AGGAGTATTC AATGTCTAAG 2880 

AAATTAACAT TTCAAGAAAT TATTTTGACT TTGCAACAAT TTTGGAATGA CCAAGATTGT 2940 

ATGCTTATGC AGGCTTATGA TAATGAAAAA GGTGCGGGGA CAATGAGTCC TTACACTTTC 3000 

CTTCGTGCTA TCGGACCTGA GCCATCGAAT GCAGCTTATG TAGAGCCATC ACGTCGTCCT 3060 

GCTGACGGTC GTTATGGGGA AAACCCTAAC CGTCTCTACC AACACCACCA ATTCCAGGTG 3120 

GTCATGAAGC CTTCTCCATC AAATATCCAA GAACTTTACC TTGAGTCTTT GGAAAAATTG 3180 

GCAATCAATC CTTTGGAGCA CGATATTCGT TTTGTTGAGG ACAACTGGGA AAACCCATCA 3240 

ACTGGTTCAG CTGGTCTTGG TTGGGAAGTT TGGCTTGACG GAATGGAAAT CACTCAGTTC 3300 

ACTTATTTCC AACAAGTCGG TGGATTGGCA ACTGGCCCTG TGACTGCGGA AGTTACCTAT 3360

GGTTTGGAGC GCTTGGCTTC TTACATTCAA GAAGTAGACT CTGTCTATGA TATCGAGTGG 3420

GCTGATGGTG TAAAATACCG AGAAATCTTT ATCCAGCCTG AGTATGAGCA CTCAAAATAT 3480 

TCATTTGAAA TTTCGGACCA AGAAATGTTG CTTGAAAACT TTGATAAGTT TGAAAAAGAA 3540 

GCTGGTCGTG CATTAGAAGA AGGCTTGGTA CACCCTGCCT ATGACTATGT TCTCAAATGT 3600 

TCACATACCT TTAATCTGCT TGACGCGCGT GGTGCCGTAT CTGTAACAGA GCGTGCAGGC 3660 

TATATCGCTC GTATCCGTAA CTTGGCCCGT GTCGTAGCCA AAACCTTTGT CGCAGAACGC 3720 

AAACGCCTAG GCTACCCACT TTTGGATGAA GAAACAAGAG CTAAACTCCT AGCAGAAGAC 3780

GCAGAATAAA GAGAGTGACA AATTACGAAA ATGGGCGAAC AGAGTGAGCC CTGAGCCAGT 3840 

TGCCGCAGTG ATGAAGGTAT CCTTAGTGAA ACTAAGGATA CTAGGCAAAA TTGGAGACTT 3900 

TTGGCTCCAA TTTTAGCAAT GAAACAACGA AGTTGGTTGC TTGCGTGCCA ATCACATAAG 3960 

GCAAACTGGA AAATAAAAAG ATACTTTTCG GAGAAAAAAC ATGACAAAAA ACTTATTAGT 4020 

AGAACTCGGT CTTGAAGAAT TACCAGCCTA TGTTGTTACG CCAAGTGAAA AACAACTAGG 4080 

CGAAAAAATG GCACCCTTCC TCAAGGGAAA ACCCCTGTCT TTTGAAGCCA TTCAAACTTT 4140 

CTCAACACCA CGTCGTTTGG CTGTTCGTGT AACTGGTCTT GCAGACAAAC AGTCTGATTT 4200 

AACAGAAGAT TTCAAGGGTC CAGCAAAGAA AATTGCCTTA GATAGTGATG GAAACTTCAC 4260 

CAAAGCAGCT CAAGGATTTG TCCGTGCGAA AGGTTTGACT GTTGAACATA TCGAATTCCG 4320 

TGAAATCAAG GGTGAAGAAT ATGTCTATGT CACTAAGGAA GAAATTGGTC AAGCAGTTGA 4380

AGCCATTGTT CCAGGCATTG TGGATGTCTT GAAGTCACTG ACTTTCCCTG TCAGCATGCA 4440 

CTGGGCGGGA AATAGCTTTG AATACATCCG CCCTGTTCAC ACTTTAACTG TTCTCTTGGA 4500

TGAGCAAGAG TTTGACTTGG ATTTCCTTGA TATCAAGGGA AGTCCTGTGA GTCGTGGCCA 4560 

TCGTTTTTTG GGACAAGAAA CCAAGATTCA GTCAGCATTG AGCTATGAAG AAGACCTTCG 4620 

TAAGCAGTTT GTAATCGCAG ATCCATGTGA ACGTGAGCAA ATGATTGTTG ACCAAATCAA 4680

GGAAATTGAG GCAAAACATG GTGTACCTAT CGAAATTCAT GCGGATTTGC TGAATGAAGT 4740 

CTTGAATTTG GTTGAATACC CAACTGCCTT CATGGGAAGT TTTGATGCTA AATACCTTGA 4800

AGTTCCAGAA GAAGTCTTGG TGACTTCTAT GAAGGAACAC CAGCGTTACT TTGTTGTTCG 4860 

TGATCAAGAT GGAAAACTCT TGCCAAACTT CATTTCTGTT CGTAACGGAA ACGCAGAGCG 4920 

TTTGAAAAAT GTCATCAAAG GAAATGAAAA AGTCTTGGTA GCCCGCTTGG AAGACGGAGA 4980 

ATTCTTCTGG CGTGAAGACC AAAAATTCGT GATTTCAGAT CTTGTTGAAA AATTAAACAA 5040 

TGTCACCTTC CATGAGAAGA TTGGTTCTCT TCGTGAACAC ATGATTCGTA CGGGTCAAAT 5100 

CACTGTACTT TTGGCAGAAA AAGCTAGTTT GTCAGTGGAT GAAACAGTTG ACCTTGCTCG 5160
TGCAGCAGCC ATTTACAAGT TTGACTTGTT GACAGGTATG GTTGGTGAAT TTGACGAACT 5220
CCAAGGAATT ATGGGTGAAA AATACACCCT TCTTGCTGGT GAAACTCCAG CGGTGGCAGC 5280
TGCTATTCGT GAACACTACA TGCCTACATC AGCTGAAGGA GAACTTCCAG AGAGCAAGGT 5340
CGGCGCAGTT CTAGCCATTG CAGACAAATT GGATACGATT TTGAGTTTCT TCTCAGTAGG 5400
ATTGATTCCA TCAGGTTCTA ATGACCCTTA TGCCCTTCCT CGTGCAACTC AAGGTGTGGT 5460
TCGTATCTTG GATGCCTTTG GTTGGCACAT TGCTATGGAT GAGCTGATTG ATAGCCTTTA 5520
TGCATTGAAA TTTGACAGTT TGACTTATGA AAATAAAGCA GAGGTTATGG ACTTTATCAA 5580
GGCTCGTGTT GATAAGATGA TGGGCTCTAC TCCAAAAGAT ATCAAGGAAC CAGTTCTTGC 5640
AGGTTCAAAC TTTGTTGTGG CAGATATGTT GGAAGCAGCA AGTGCTCTCG TAGAAGTAAG 5700
CAAGGAAGAA GATTTTAAAC CATCTCTTCA ATCACTTTCT CGTGCCTTTA ACCTGGCCGA 5760
GAAGGCAGAA GGGGTTGCTA CGGTTGATTC AGCACTATTT GAGAATGACC AAGAAAAAGC 5820
TTTGCCACAA GCAGTAGAAA CACTCATTTT ATCAGGACCT GCAAGTCAGC AATTGAAACA 5880
ACTTTTTGCG CTTAGCCCAG TCATTGATGC TTTCTTTGAA AATACTATGG TAATGGCTGA 5940
AGATCAGGCT GTCCCTCAAA ATCGTTTGGC AATCTTCTCA CAACTAACCA AGAAAGCAGC 6000
TAAGTTTGCT TGTTTTAACC AAATTAACAC TAAATAAAAT TTGATAAACG GACTTTATCT 6060
TATTACAAAG GAGAAGAAAT GGATCCGAAA AAAATTGCTC GTATCAATGA GCTTGCTAAA 6120
AAGAAAAAAA CAGAAGGCTT AACACCAGAA GAAAAAGTGG AACAACCCAA ACTACGTGAG 6180
GAGTACATCG AAGGTTATCG CCGCGCTGTT CGTCACCACA TTGAAGGAAT CAAAATTGTG 6240
GACGAAGAAG GAAACGATGT TACACCAGAA AAACTACGCC AAGTACAACC TGAAAAAGGA 6300
TTACATGGCC GTAGTCTTGA TGATCCAAAT TCATAATAAT ACTCTTCGAA AATCAAATTC 6360
AAACCACGTC AGCTTCACCT TGCCGTACTT AAGTACAGCC TGCGGCTAGC TTCCTAGTTT 6420
GCTCTTTGAT TTTCATTGAG TATATGTATT CTTTCTTTTA ACAAAGATAG ATGAAACGAT 6480
AACAAAGAGA CTAGCAGTTT GTGTTTGCTA GTCTTTTTTC GCTAAAAAAG GAACCATAAT 6540
GGTTCCTAAA AACTATCATT AGTAACTTGC ACCGGCTGTA GCGTCTGCGT CACCACCGTG 6600
GCCTCCAGCA TCCCCTGAAT CAGAAGCGCC AGAAGTAGCA TCGGCGTCTC CATGACCTCC 6660
GGCAGCAGGA GCAAATGGTC CGCTACCACC CACCAAACGT TGACCAGTCT CTTTTAGGTA 6720
CCAGTCAAGC CATGGTTGGA AGTTAAAGAC GATTTCATTG ATACCAGCGT ATGATCCATC 6780
AGGATAGTAC ATTGCTTGGT AGTTGTGAGT GTTGATAACA CCTGCAGGAG AACCTGGAAC 6840
GATCGTACGG ACGTATTCTT GGTTTCCGTT GCGAAGTGTT CCGATAACCC ACTCTACGTT 6900

CTTCATACGT GCTGGTGGAA GAGAACCATG AACAGTCGAC ATACGGCTAC CTGATTGAGG 6960

TGGTACACGT TTAGCGAACA TAGTGTCTGG ATCTTGGTGA GCGTTGTTGT AGTAGAGGAA 7020

TTGGTTGTTG TCGTCAGCGT ATGTCAATTC AAATGGCATA GCTTTCAAGA ACATATCAAT 7080

TTGGTTAACT GTTAGGATAC CGTGGTCCAA TTTGACATAG GTATCACCAG AAACACCACC 7140

AGTGAATGCT GCAACTTTTT CTACCCATTC TGGATCGTCA GGGTCAACTT CTGTGATGGT 7200

TGTAGCGATT fitITT TT*CC AC AATCCAAGTC TTCTGATTCG ATTGGTTTTG GTTTTTTCAA 7260

TTTCGAAACG ACTCCTACGT ATTTAACAAA GTTATCTAAG CAACTTTCAA GGAATTTAAC 7320

AjGTGCCTT cg TTGGTGATAT TTCCGTTGTT ATCAAAAGCT TCCTTAGCTT TACCAAGAAG 7380

[illegible] CCTGGAAGCG TGTAGGCATT AACACCTGGA GCATCAAGGA TTTTACGAAG 7440

[illegible] [illegible] TTCCTTGGTC ATAGTATGAT GCACCCACAA TCATAACAGG 7500

[illegible] AATGGATGAA CTTCGTATGA AAGCCATTCA AGTACAGATT TGAGTGAAGC 7560

[illegible] [illegible] CAGGAGTAGC AATGATAACA CCATCTGCAC GAGTAATTTT 7620

A A A TAACGTAATT GGAAACTTTC ATCCCATTTT TCATCTTGGT TAAACATTGG 7680

[illegible] ATTTCAAGAA CTTCTAATTC AAATTTGAGT TTGAAGTAGC GACGGATAAA 7740

[illegible] [illegible] ATGATTGATC GTAGTTTGAT CCAACAAGTC CAACAAATTT 7800

[illegible] [illegible] CTTACAAATT TTCCCAGTCA AAGTCTTCAG CATCTTTGCG 7860

[illegible] TGTGGATTAf' GTAATTTTTC TGTGATTTTT ACAAAGATAC GGAACTCATC 7920

&AAGATGGPA [illegible] TGATAACATC AAGGTCAACC AAGTCGCCAC TTGGGTTAAA 7980

"^G r'TH A AG A GACTGTGAGA [illegible] ATr""*GGAAGA ACATTTGCCT TGATTTCAGG 8040

AGGATTCAAC ATTTGACCAA GTTGCAATTG GGCACGAGAT GAACCAAGCG TACCCTAAGA 8100

AGGACCTGTA ATGATGATTC GTTTGTTCAA AAGTGGGTAA ATACCATAAG ACAACCAAGC 8160

AAGAGCGCTC ATCAAAACAG CTGGAATAGA GTGATCATAC TCAGGAGTAC CGATAATAAC 8220

GCCATCTGCC TCTTCGATTT TAGCAGCAAT TTCCAATATT TCAGCAGGTA CTTGCTTGTC 8280

AGCTGGTTTG TTGAAGACAG GAATGGCCTT GATTTCAACA AGTTCAATTT CAGCTTTGTC 8340

AGTAAAGTGT TTTTGCATGT ATTGAAGCAA TTGACGCTTT GTAGAACGTT TTGAATTTGT 8400

TCCAACAATA GCAATAAGTT TTAACATGAG ATTTCCTTTC TCTTTTTACA TAATACAATT 8460

TTAAAATTCC ATTGAAACAG TTGTCTCTAT AGAGTAGGAA TTCCTGAAGA ACAGCTTAGG 8520

TGGCCTTCTT TATCGATGAG GATGACTTCG ATGCCCTCCA AACTTTCGAC TTGCCAGAGG 8580

ATAGAAGCAG GTCTTTCTCC AAAGAGTCGA GTCGTCCAGA TTTCGCCATC GACTGATTTA 8640

TCAGAGATGA TTCTTACACT CGCTAGTTCC GTTTCAACAG GATATCCTGT TTGACTGTCA 8700

AAAATGTGAT GGTAATCTTG TCCATCGACG GTCAGGTGAC GTTCATAAAT GCCTGAAGTC 8760

ACGACAGATT TATTGACAAC AGGGATGGTC ATTAAATGAT TTCCCCTAGG ATTGGCTGGG 8820

TCTTGAATCC CGATTTGCCA TGGGTTATCC CCTCTTGCCT GATTTTTTCC AATGGTCAGG 8880

ATATTCCCTC CCAGATTGAT CAAGGCAGAA GTCACCCCCT CTTTCCTAAG AAATTGGGCA 8940

ACCTTATCCG CACTGTATCC TTTGGCTAAA CAACCTAGAT CGATCTTCAT TCCTTTCTGT 9000

TTTAAAAACA CAGTAGAAGT AGAAGAATCT AACTCGATAC CATGAGGATT GATTAGAGGC 9060

AGCACCGATT CAATTTCTTG AGGCTGGGCG ACCTTGGCAT CTGAAAAACC GATACGCCAG 9120

GTTTGAATTA AGGGACCAAT GCTGATATTG AGGTGGCTAG AGAGCGCTAG GCTATGCTCT 9180

AACCCAAGTG AAATCAGCTC AAACAGGTCT GGATGAACCG TGACGGGGGC TATTCCTGCT 9240

TGATAATTGA TTTCCATCAA CTCAGATTCT TGACTATTGG CGTTGAAGCG GTATTCAAGT 9300

TCTTTGAGCA AGTCAAAGGA TTTTTGGAGA AAGATATCGG CTTGCTCATC CACTAATGAA 9360

ATAGTGATAG TAGTCCCCAT TAGCCGTTCA GAATGTGAAC GAAGAGTCAA GCTACCAACT 9420

CCTTTCTCTT ATAGAAAATA AGTTGTAATA TCAAATAATC ATCTAAATTG AAGCCCTTAC 9480

ATTTCATTTT CATGTTATTA TAATACCATA AAGTTAGAAT TTTCACAAAC AAAATTTGGA 9540

AAAAGTCAAG AAATATGCTC ATAAAATTCA TCAGGCTTGA AAACAGGATA AATGGGGAAT 9600

TATTTTTGAT AAAAAATCCT GAAATAATAG TACCCCCCTT GTAAACGCTA ACGGTAAATG 9660

GTATACTAGT AAGGTAAATT TAGAATGAAG GCAGGAAATT TTTATGAGTA AAATCGTTGT 9720

AGTCGCTGCT AACCACGCTG GTACAGCATG TATCAATACC ATGTTGGATA ATTTTGGAAA 9780

TGAGAACGAA ATTGTTGTAT TTGACCAAAA CTCTAACATC TCTTTCCTAG GATGTGGAAT 9840

GGCTCTTTGG ATTGGTGAAC AAATTGACGG TGCTGAAGGC TTGTTCTATT CTGATAAAGA 9900

AAAATTGGAA GCTAAAGGTG CTAAAGTTTA CATGAACTCA CCTGTTCTTT CAATCGACTA 9960

TGATAACAAA GTAGTTACAG CGGAAGTTGA AGGAAAAGAG CACAAAGAAT CATACGAAAA 10020

ATTGATTTTC GCTACAGGCT CTACACCAAT CTTGCCACCA ATCGAAGGTG TTGAAATTGT 10080

TAAAGGAAAC CGCGAATTTA AAGCAACTCT TGAAAACGTA CAATTCGTGA AATTGTACCA 10140

AAATGCTGAA GAAGTTATCA ATAAACTTTC TGACAAGAGC CAACACCTCG ACCGTATCGC 10200

CGTTGTTGGT GGTGGTTACA TCGGTGTTGA ACTTGCTGAA GCCTTTGAAC GTCTTGGAAA 10260

AGAAGTTGTC CTTGTTGATA TCGTTGATAC TGTCTTGAAC GGTTACTATG ACAAAGACTT 10320

CACACAAATG ATGGCGAAGA ACTTGGAAGA TCACAACATC CGCTTGGCTC TAGGTCAAAC 10380

TGTTAAAGCA ATCGAAGGTG ACGGTAAAGT TGAACGCTTG ATTACTGACA AAGAAAGCTT 10440




TGACGTGGAT ATGGTTATCC TTGCAGTTGG TTTCCGTCCA AACACAGCCC TTGCAGGTGG 10500 

TAAGATCGAA CTCTTCCGCA ACGGTGCCTT CCTTGTAGAC AAGAAACAAG AAACATCTAT 10560 

CCCAGACGTT TACGCTGTTG GTGACTGTGC GACTGTTTAT GACAATGCTC GTAAAGATAC 10620 

AAGCTATATC GCTCTTGCTT CAAATGCTGT GCGCACTGGT AACGTTGGT 10669 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7542 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CGCGCTAATA GATACTTTAT GATAGAATAA AGAACAAGAT TGACAAGTAA GAGGAAACAT 60 

TATGCAAAAT CAAACACTCA TGCAATACTT TGAATGGTAT CTCCCCCACG ACGGTCAACA 120 

CTGGACGCGT CTGGCTGAAA ATGCTCCACA CCTAGCTCAT CTGGGGATCA GTCACGTCTG 180 

GATCCCACCA GCCTTCAAGG CAACCAACGA AAAAGATGTC GGCTATGGGG TCTATGACTT 240 

ATTTGACTTA GGAGAGTTCA ACCAAAAAGG GACTGTCCGC ACCAAGTATG GTTTCAAAGA 300 

AGACTATCTT CAAGCCATTC AAGCCCTTAA AGCACAGGGA ATTCAACCTA TGGCCGATGT 360 

AGTTCTCAAC CACAAGGCTG CTGCCGATCA CAGCGAAGCC TTTCAGGTTA TCGAAGTTGA 420 

TCCTGTAGAC CGTACAGTTG AACTTGGAGA ACCCTTCACC ATCAATGGCT GGACTAGTTT 480 

TACCTTCGAT GGTCCCCAAG ATACCTATAA TGGCTTCCAC TGGCATTGGT ACCACTTCAC 540 

CGGTACAGAC TACGATGCCA AACGCAGTAA ATCTGGGATT TATCTGATCC AAGGGGACAA 600 

CAAGGGCTGG GCCAACGAGG AATTGGTCGA TAACGAAAAC GGAAACTACG ACTACCTCAT 660 

GTATGCCGAC CTAGACTTTA AACATCCTGA AGTCATCCAA AACATCTATG ACTGGGCTGA 720 

TTGGTTCATG GAAACGACTG GTGTACCTGG TTTCCGTTTG GATGCCGTTA AGCATATTGA 780

CTCTTTCTTT ATGCGCAACT TCATCCGCGA TATGAAGGAA AAATACGGTG ACGATTTCTA 840 

TGTTTTTGGT GAATTTTGGA ACCCAGACAA GGAAGCCAAT CTGGACTATC TCGAAAAAAC 900 

GGAAGAACAC TTTGACCTTG TCGATGTTCG TCTCCACCAG AATCTCTTTG AAGCCAGTCA 960 

AGCTGGCGCA AACTATGACC TTCGTGGCAT TTTCACAGAT AGCCTGGTTG AACTCAAGCC 1020 

TGACAAGGCT GTGACTTTTG TCGACAACCA CGATACCCAA CGAGGACAAG CCCTTGAGTC 1080 

TACCGTTGAA GAATGGTTCA AGCCAGCAGC CTATGCCCTC ATTTTGTTAC GCCAAGACGG 1140 

CCTTCCATGT GTCTTTTACG GAGACTACTA TGGGATTTCA GGGCAGTATG CTCAAGAAGA 1200

TTTCAAAGAA ATCCTTGACC GCCTCCTAGC CATCCGAAAA GATTTGGCCT ATGGAGAACA 1260

AAATGACTAC TTTGACCATG CTAACTGTAT CGGTTGGCTA CGTTCAGCTG CTGAAAATCA 1320

ATCCCCAATC GCAGTCCTTA TCTCAAATGA CCAAGAAAAC AGCAAGTCAA TGTTTGTCGG 1380

TCAAGAATGG ACTAATCAAA CCTTTGTAGA TTTACTTGGT AACCACCAAG GTCAAGTTAC 1440

AATTGATGAG GAAGGTTATG GACAATTCCC TGTCTCAGCT AGATCCGTAA GTGTCTGGGC 1500

AGTCAATACC ATCTAATAGC TCATAATAAC CAAGCTAGGT CCAAGCGGAT TTGGCTTTTT 1560

TGTATTCACA AAAAGACCTA CCCAAATGGA TAGATCTTTA CTTGATTACA ATTTACCTGC 1620

TACTGCATCC AACAATTCTT GCATCTTAGG TTGGTTGCTT CCTCCTGCCA TGGCCATATC 1680

TGGTTTACCA CCACCACGTC CATCGATGAT TGGTGCTAAT TCTTTGACAA GGTTTCCTGC 1740

ATGAAGGTCT TTTGTCTTGC TTGCTACAAG GACATTGACT TTGTCACCGA TAGCGGCAAC 1800

TAGGACAAGA AGATCAGAGT AGTCTTTTTG TTTCCAGTTA TCTGCAAAAG TACGAAGGGC 1860

ACCGGCATCG GATACAGACA CTTGACTAGC AATGTAACGA TGACCGTTGA CTTCCTTAAC 1920

ATCTTTCAAG ATATCGCCTG CGGCTGCAGC TGCGGCTTTT TCTTTCAACT CAGCATTTTC 1980

TTTTTGAAGT TGACGAAGTT GTTCTTGAAG TCCTTCTACC TTGTGAGGTA CTTCCTTGAC 2040

TTGAGGTGCT TTCAAGGTTG CTGCGATACC TTTAAGAGCA TCCTCTTGTT CACGATAGGC 2100

TTCAAAGGCT TCCTTACCAG TCACTGCCAA GATACGGCGA GTTCCTGAAC CGATTCCTTC 2160

TTCTTTGACA ATTTTGAAGA GACCAATCTC AGAACTGTTG TCAACATGAG TACCACCACA 2220

AAGTTCAATA GAGTACTCAC CGATAGTCAC GACACGAACT TCCTTGCCGT ATTTCTCACC 2280

AAAGAGGGCC ATAGCTCCCA TTTCTTTAGC AGTGTCAATA TCCGTTTCAA CTGTCTTCAC 2340

TTCAAGTGCT TCCCAAATTT TCTCGTTAAC TTGCTGTTCA ATCGCACGAA GTTCCTCAGC 2400

AGTTACTGCT TGGAAGTGGG TAAAGTCAAA GCGAAGGAAT TCAACTTCGT TAAGAGATCC 2460

TGCCTGTGTT GCGTGGTTTC CAAGGATATT GTGAAGGGCA GCGTGAAGCA AATGAGTCGC 2520

AGTGTGGTTT TTCATGACAC GGTGACGGCG ATTGCTATCA ATTGCCAAGG TATATTCTTG 2580

GTTCAAGGCA AGCGGTGCAA GGACTTCAAC TGTATGAAGG GCTTGACCAT TTGGGGCTTT 2640

CTGAACATTG GTCACAGTAG CCACAACCTT ACCTGACTCA TCCAAGATTT GTCCGTAGTC 2700

AGCTACCTGT CCACCCATTT CAGCATAAAA TGACGTTTCC GCAAAGATAA GAGAGGCAGT 2760

TCCTTCTGAA ACAGCTCCTA CTTCTGCATT GTCAGCAACG ATAGCTACCA ATTTAGAAGA 2820

CAATTGGCTA GCATTGTAGT TGAAGACACT TTCTACAGTG ATGTTTTGAA GAGTTTCATT 2880

TTGCATACCC ATTGAGCCAC CCTTGACAGC TGACGCACGC GCGCGTTCTT GCTGTTCTTT 2940
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CATGGCTGCT TCAAAACCTT CACGGTCTAC AGTCATACCA GCTTCTTCAG CGATTTCTTC 3000 

AGTCAATTCA ACTGGGAACC CATAAGTATC ATAGAGTTTG AAGACATCTG AACCAGCGAT 3060 

AACAGATTGA CCTTTTTCTT TCAAGTCTGC TACAATGCCT TGGGCAAAGT GTTGACCTGA 3120 

GTGAAGGGTA CGGGCAAATG ATTCTTCTTC GCTCTTAACG ATTTTCTCAA TAAAGTCACG 3180 

TTTCTCAAGC ACTTCTGGGT AGTAGCTTTC CATGATTTTT CCAACAGTTG GAACCAATTT 3240 

GTAAAGGAAA GGCTCGTTGA TACCCAATTT TTGACCATGC ATAGAAGCAC GACGGAGAAG 3300 

ACGACGAAGA ACATAACCAC GACCTTCATT TCCTGGAAGG GCACCATCAC CGATAGCAAA 3360

TGAAAGAGAA CGAATGTGGT CTGCGATAAC CTTGAAGCTC ATGTTGTCGC CATCTTGGTC 3420 

ATAAACCTTA CCAGACAATT TCTCGACTTC ACGGATAATC GGCATGAAGA GGTCCGTTTC 3480

AAAGTTGGTC TTAGCCCCTT GGATAACGGC CACCAAACGC TCCAAACCAG CGCCCGTATC 3540 

AATGTTCTTA TGTGGCAATT CCTTGTATTC GCTACGAGGA ACAGCACGGT CTGCGTTAAA 3600 

TTGTGACAAA ACGATGTTCC AGATTTCAAT ATAACGGTCG TTTTCAATAT CTTCTGCAAG 3660 

CAGGCGAAGA CCGATATTTT CTGGGTCAAA GGCTTCCCCA CGGTCAAAGA AGATTTCTGT 3720

ATCTGGTCCA GAAGGTCCCG CACCGATTTC CCAGAACTTG TCCTCAATTG GAATCAAGTG 3780 

ACTTGGATCC ACTCCCACTT CAATCCAGCG CTTCTAAGAA TCTTTATCGT CTGGATAGTA 3840

GGTCATGTAA AGTTTTTCAG CAGGGAAATC AAACCATTCA GGGCTTGTCA AAAGCTCATA 3900 

AGCCCAAGTG ATAGCTTCGT CACGGAAGTA ATCCCCGATA GAGAAGTTCC CCAGCATTTC 3960 

AAACATGGTA TGGTGACGCG CGGTCTTCCC TACGTTTTCG ATGTCGTTGG TACGGATACC 4020

CTTTTGGGCA TTGGTAATAC GTGGATTTTC AGGGATAATG GTCCCGTCAA AGTATTTCTT 4080

AAGGGTTGCT ACCCCAGAGT TGATCCACAA AAGAGTTGGG TCATTTACAG GAACCAAACT 4140

TACTGATGGT TCTACTGAGT GACCTTTGGT CGCCCAGAAA TCAAGCCACA TTTGGCGTAC 4200

TTGTGCACTA GATAGTTGTT TCATATTGTC TCCTTATTCA CTTGTTTAAT GTGATTGGCT 4260

TTCCAGCATT TCCACATAGT CAATCGCGAC ACAGAGGGAA ATGACTAGGT CTGCATAAGC 4320

GTCTTCAAGA ACCGTTACGG TATAGGTAGA AGTCAGATGG AAGAGTTCCT TCTTAATTTC 4380

CGCAATCAAC TGATCGCGAT CATCCAGCAA TTTGAAATTC AAATCCCAGA TATTGCCCTC 4440 

GATACGAAGA CCTAGATTAT CAAACTCATA CTTATCTCGC CAGAAGGTCA ACTTCTTACG 4500 

AATGACAAAA CTCGAGCCAT CCCGAAGCTG AATTTCAAAA CGAGGAAGCA ACCTCAAGAT 4560 

TTCTTTACTA ATCTCACTGA CTTGTTCACC AGCCGCATCA TAGATGGTAA AGGTTTTAGG 4620

AATCTTAAAA AATGATCCCT CCACCTGATA GGCAATTTCT CCCCTGTCAT CCTTGATAGC 4680

GAAGCGTTCG CCTCCAAGAC GAAACTTTTG TTTGACAAGA AATGTTTTCA TCAACACCTC 4740 
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CAAAAATCAA AAGACAAGCT CATATCACGA AGGGCGAAAA ACCGCGGTAC CACCTTCATT 4800 

CAATGAACTT GTCATTCTCT TGTTCTTATG CAATTGTATG ATTGAGTAGC ATGACTTCCT 4860 

AGCTTAGATG GCTCGCAGCA CCGCCATTTC TCTGGACTAA GACAAGTGAA AATCAATTCT 4920 

CAACTTTCTT ATTATAACGT TTTTTTAAGC TTGCGTCAAC TGCAAATGAT CTCCGTTGAA 4980 

TTAGACCAAT TCCCTACATC TCTGATTACT TTTTCAGGAT ATATTTTTTC TTACTGCCAT 5040 

TTTTCTTTTT ATCCCAAATT TTCATATTAC TAAACACAGC TACTAGAATA TTTCCAAATA 5100 

TAAAGGTGCC TATCACCCAA TATATGGACT CACTTGTTAG GTATTCTCCA TCCAAGCCAT 5160 

CCTTTAAATG GAATAGTATA GCAGTTTGGT TAACAATCAT AAAGGTTGGC CAGAAACTTT 5220 

TTTTGAAAAA AGTAGACATT TTCATTATTT GTTGCCGCTT TCTGTAAGGT TAATACTCAA 5280 

TAAAAATCAA AAAGCAAACT AGGAAGCTAG CCTCAAGCTG TACTTGAGTA CGGCAAGGCA 5340 

ACGCTGACGT GCTTTGAAGA GTATAGGCTT AGTATACTAC TAGGCAAGCA AATAAACAAA 5400 

TAAACAACTA GAATAGAAAA AGATAGGGCT CTAAAAACTG ACTTCTATTC CTTAAAAACG 5460 

AACCAGCTTG ACTGATTCGT CTTCTTACGT TTATCTCCTA CTTCCGATAC ATTTTAAACT 5520 

GTAGGAAGAG GTCGCTATAT TTCCCTGTCC ATTTATCCTC AAATTTCTCA TAAACTTCTA 5580 

GGTGTTTCAT GGTTTCAACA TCGGGATAGA AGGCCTTATC TTCCTTTGTT TCCTCTGGGA 5640 

GCAATTCCTT CGCTGGTAGG TTTGGTGTTG AATAGCCGAC ATACTCCGCA TTTTGGAGAG 5700 

CATTTTCAGG TTTCAACATA AAGTTGATAA AGGCATAGGC TGAGTTTTCG TTTTTAACTG 5760 

TTTTGGGAAT GACCATATTG TCAAACCAAA GATTGCTGGC CTCTGTCGGT ACCACATAAC 5820 

GTAGATTTTC ATTTTTTTCT AACATTTGGC TGGCTTCACC AGAGAACGTC ACGCCGATTG 5880 

CAACATTATT CTOAATCATA TACCCCTTCA TCTCGTCCGC AACGATAGCC TTGATATTTG 5940

GAGTCAGTTT GTAGAGCTTA TCCACTGTCT CTTCCAACTG CTGCAGATCC TTGGAGTTCA 6000 

GGCTGTAGCC GAGGGAATTG AGTCCTAGTC CCAGCACCTC ACGCGCCCCA TCAAAGAGCA 6060 

TGATAGAATT CTTATACTCC GGCTTCCAAA GGTCATCCCA ATGCTCAGCC GCTTCATCTA 6120

CCATGGTTTC GTTGTAGACA ATTCCTAAGG TTCCCCAGAA GTAAGGGATG CAGAATTTAT 6180 

TACCTGGGTC AAAGGACTGG TTGAGAAACT CTGGTCCGAT ATTTTCGATT CCTTCAATTT 6240

TTGAATAATC AAGCGGAACC AAGAGGTCTT CGTCCTTCAT CTTGTTAATC ATGTATTCAC 6300 

TTGGAATGGC AATATCGTAG GTCGTTCCAC CCTGCTTTAT CTTAGTGTAC ATGGCTTCGT 6360 

TGGAGTCAAA AGTCTCGTAC TGAACTTGAA TTCCTGTTTC TTCTGTAAAC TGAGTCAAGA 6420 

GTTCAGGATC GATATAGTCT CCCCAGTTAT AGATAACCAA TTTTTGACTA TCTCGACTAT 6480

TGATTTTACT ATCTAAATGA GTCGCAATTC CCCACAAGAC AAGGATAATC GCTGCAATTC 6540

CTGCTAAAAA TGAATAGATT TTTTTCATGC TTGCTCCTCC TTCTCACGAG AGATAAAGTA 6600

ATAACCTACA ACTAGGATAA TACTAAAGAG AAAGACTAGA GCAGACAGGG CATTGATTTC 6660

TAAGGAAATC CCCTTGCGAG CACGAGAGTA AATCTCGACT GATAGGGTTG AAAAGCCATT 6720

TCCTGTTACA AAGAAGGTCA CGGCAAAGTC ATCTAACGAA TAGGTGAAGG CCATGAAATA 6780

ACCAGTAATG ATAGACGGAG TCAGGTAAGG AAGCATGATT TCCTTGAACA TCTGAAATTG 6840

ACTAGCTCCC AAGTCATAGG CCGCATGAAT CATGTCGCCA TTCATTTCCT TGAGTCGAGG 6900

CAAGACCATC AAGACCACGA TAGGAATGGA GAAGGCCACG TGACTAGATA GAACGGTCAA 6960

AAAGCCAAGT GAAAACTTGA GTTGGGTAAA GAGAATCAAG AAGCTACCAC CAATCATAAC 7020

GTCAGGCGCA ACCATGAGGA TATTATTGAG TGATAGAAAG GCTTCTTCCT ATTTCTTACG 7080

AGACTGGTAG ATGTAAATGG CACCAAAAGT CCCGATAATG GTCGCTATCA AGGCTGATAG 7140

GAAGGCCAAG AAAAATGTCT GAGCCAAAAT CAGCATGAGT CTCCCATCTC CAAACATGGT 7200

TTCAAAGTGA GTCCAGCTAA AACCTGTAAA GCTATTCATA TCATCACCAG CATTAAAGGC 7260

ATAGCCAATC AAGTAAAAGA TAGGCAGGTA GAGGACCAGA AAGACCAGTC CCAGATAAAG 7320

GTTGGCAAAT TTTTTCATCG TTCTCTCCTT TCCTTAGTCA CCCACATGGT GATGAACATG 7380

GTCAGGATGA GAATCACACC GATGGTTGAA CCCATACCAT AGTTGTCATT GCTTAGAAAA 7440

TTCTGCTCAA TAGCCGTCCC CAAGGTGATA ACGCGTTCCC ACCAATCAAA CGGGTCAGCA 7500

TGAAGAGACT CAAACTTGGG ATAAAGACCG ACTGAACCCC GG 7542

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9223 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
AAAACCAAAT TCCGGTATTT TAACCTATGC TGTAAATACC ATGAAGTCTG 



TCATGACAGA 
AAGCTAGCCA 
AAAACAAGGA 
TATCCGTCAT 

TCCTGATTTA TTAAAAGGGG ATGTTGTCGC AAATGATAAT ATCGAGTATA TCAAAGCGCG 

ATAAATCAGA 



TCAGGTCTAT AACATTAAGG TTGAGACAGA AAATGGAAAT TATGTTGGTG 
TGTTTTGGTC CTTTTGACAA ATTACTTCGC TGATAAGAAA ATCTTTGAAG 
CGGCTATGCC AACATTTTGA TTCTGAAAGA TGCCTCTATA TTCTCCAAAT 



TAATATTAAA ATCTCTTCAG ATAGTGAATT GGAGTCAGAT GTTGACGGAG 



60 
120 
180 
240 
300 
360

TAACCTACCT GTAGAAATCA AAGTCCTAGC TCAGCGAGTA GAAGTATTTT CAAAACCGAA 420

AGAGGATTAG TATATAGAGA AAGCCTTTTT TAAGGCTTTT TGTATACTTT AAAAGATAGT 480

TCCTTTAACA ACGGACATTC CTTGCAAATA GTTTTACAAA AATAGTATAC TGGATTCATT 540

GAGTTTGAAA ACGTTTGCGT AAAATTTGAA TGAATACTTT AGGAGACAAA TTGATGGAAT 600

TGAGTGCTAT TTACCATAGG CCTGAGTCGG AGTATGACTA TCTTTATAAG GATAAGAAAC 660

TCCATATTCG AATTCGAACT AAGAAAGGGG ACATTGAAAG CATCAACTTG CACTATGGGG 720

ACCCTTTTAT CTTTATGGAG GAGTTTTATC AGGATACAAA AGAAATGGTC AAGATAACTT 780

CTGGTACCTT ATTTGACCAT TGGCAGGTTG AAGTGTCAGT TGACTTTGCA CCTATCCAGT 840

ATCTCTTTGA GCTCAGAGAT ACAGAAGGTC AAAATATTTT GTATGGCGAT AAAGGGTGTG 900

TGGAAAATTC TCTAGAAAAT CTTCATGCAA TTGGGAATCG ATTTAAGTTG CCTTAGCTTC 960

ATGAGATTGA TGCCTGCAAG GTTCCTGACT GGGTTTCAAA TACGGTATGG TATCAGATAT 1020

TTCCTGAAAC ATTTGCCAAT GGCAATGCTC TATTAAACCC AGAAGGGACT TTAGACTGGG 1080

ATTCATCTGT CACACCTAAG AGCGATGATT TCTTTGGTGG TGATTTACAG GGGATTATTG 1140

ATCATATGAA TTACTTGCAA GACTTGGGTA TTACTGCACT ATATCTTTGT CCCATCTTTG 1200

AATCTACAAG CAATCACAAG TACAATACGA CAGATTACTT TGAAATTGAC CGTCATTTTG 1260

GAGACAAGGA GACCTTTCGG GAACTGGTGG ATCAAGCGCA TCATCGTGGC ATGAAAGTCA 1320

TGCTGGATGC GGTATTTAAT CATATTGGTT CGCAATCTCT TCAATGGAAA AATGTCGTCA 1380

AAAATGGTGA ACACTCTGCT TATAAGGATT CGTTCCATAT TCAACAATTC CCAGTGACAA 1440

CTGAAAAGCT AGTTAATAAG AGAGACTTAC CCTATCATCT TTTTGGTTTC GAGGACTATA 1500

TCCCTAACCT AAATACAGCC AATCCACAGG TCAAGAATTA TCTTTTAAAG GTTGCCACTT 1560

ATTGGATTGA AGAGTTTAAT ATCGATGCTT GGCCTTTGGA TGTGGCTAAT GAGATTGACC 1620

ATCAGTTCTG GAAGGATTTT CGTAAGGCAC TTTTAGCTAA AAATCCTGAT CTTTATATCC 1680

TAGGAGAAGT CTGGCATACA TCTCAGCCTT CGCTAAATGG AGATGAGTTC CATGCCGTCA 1740

TGAATTATCC TTTATCTGAT AGTATCAAGG ACTATTTCTT ACGAGGAATT AAGAAGACAG 1800

ACCAGTTCAT CGATGAAATC AATGGAGAGT CTATGTATTA CAAGCAGCAG ATTTCAGAGG 1860

TCATGTTTAA TCTCTTGGAT TCACATGATA CAGAGCGAAT CCTGTGGACG GCCAATGAAG 1920

ATCTTCAACT GGTTAAATCA GCCTTAGCCT TTCTCTTTTT ACAAAAAGGA ACACCGTGCA 1980

TTTATTACGG AACCGAGCTA GCCTTGACTG GAGGACCAGA TCCAGATTGT CCTCGTTGTA 2040

TGCCTTGGGA ACGTGTATCA AGTGACAATG ATATGCTGAA CTTTATGAAG AGGCTGATTA 2100
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AAATTCGGAA ATACGCGTCA GTAATCATTT CGCATGGCAA GTATAGCCTT CAAGAAATCA 2160 

ACTCTGATCT AGTAGCTCTG GAATGGAAAT ACGAAGGACG GATCCTCAAA GCAATATTCA 2220

ACCAATCAAC AGAAGATTAT CTTTTAGAGA AAGAAGCAGT AGCACTAGCA AGCAATTGCC 2280 

AAGAATTGGA TAATCAGCTT GTCATCTCTC CAGATGGATT TATGATTTTC TAAAAACTAG 2340 

TTGATGAAGA TTATGGTACA TTTCATACCT TATATAGTAT AATAAGGCTA GTTACTAAAC 2400 

TTGTAAAGGA GAACTTAAAT GAATTGTAGA GGACATGAAA CAAGACAAAG AATTGTTAGA 2460 

GATTTTGAAG TTCAGCCTAA AGCACATATT AAGCTGTTAG CAAATCAACA AAAACATAGT 2520 

GATGCAGGAG CAACTATTGA AGATGAATAT TATGTATTTA TCGCTGAGAG TAAAATTGAT 2580 

GGCAAGAAGG AAGTTATTCA GTGTTGCATG GGTGCGGCAA GGGATTTTTT AGAACTAATT 2640 

AATCACAAAG GGCTACCTCT TTTTAATCCG CTTGTAGGTG ATTCTCATGT AAATAATAGA 2700 

CAAGAATATG ACAATACAGG GAGTGGAAAT TTATAACCTG AAAAGTGGAA TGAAACTCCA 2760

AAGCAGCTTT ATAATGCTAT AATGTGGTTG ATTATTTTAT GGAATGCTAA GCCGGATACA 2820 

CCTTTATTTA ATTTTAAAGA CGAAGTAATT AAGTATAAAA CATATGAGCC TTTTGAAAGC 2880 

AGTATAAAAA GAGTAAATAC TACTATAAAG AATGGTAGTA AAGGGAAAAC TCTGACTGAC 2940 

ATGATTAATG GCTACAGAGC GGATAACGAT ATTAGAGATG AAATTTGTAA CTTTAATATT 3000 

CTGAAAAATA AAATTCGTGA TATGAAAAAC CAACAAGGAA ATACAATGGA ATCTTACTTT 3060 

TAGTTATTGT TGAATTTTGG GTATTCTATA AAATATCCTA ATTGAGATTT AAATAGTAGA 3120 

CTATACAATA TAGTTAAAAT ATCAGTAAAA ACAACACTTT ATTGAGGTAT TGGATACGCT 3180 

TTGCTAATAG CCTAATAATC ACATCTGGAG TCTTGCTACA ACGAAAAAGG TGATAATCCT 3240 

TGATTTCAAG CTATTTTATA AGCATTTTGT CTTTGTAGAT AAAGGCAAT? TTGACAATAA 3300 

AAATCCTAAA AGCTGAATCG TTATAGATGT ATTTGTAGAT ATCGTTTGCG CATCGAAAAA 3360 

ATTAATACAA GAATAAATAT TTATAGCTCT TTAGGTGACT TTTATAGAAG TAAAGTTTAC 3420

GATAGAAAAA CAAGAAATAA CGCACCATTT TTGGTGCGTT ATGCTTTTTT ATGCTATAAT 3480

GCATTTATAA AAATAAAGGA GTTTGCTATG ATTGGAAAGA ACATAAAATC CTTGCGTAAA 3540

ACACATGACT TAACACAACT CGAATTTGCA CGGATTGTAG GTATTTCACG AAATAGTCTG 3600

AGTCGTTATG AAAATGGAAC GAGTTCAGTC TCTACCGAAT TAATAGACAT CATTTGTCAG 3660

AAGTTTAATG TATCTTATGT CGATATTGTA GGAGAAGATA AAATGCTCAA TCCTGTTGAA 3720

GATTATGAAT TGACTTTAAA AATTGAAATT GTGAAAGAAA GAGGTGCTAA TCTATTATCT 3780

CGACTCTATC GTTATCAAGA TAGTCAGGGA ATTAGCATTG ATGATGAGTC TAATCCTTGG 3840

ATTTTAATGA GTGATGATCT ATCTGATTTG ATTCATACGA ATATCTATCT AGTAGAAACT 3900 
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TTTGATGAAA TAGAGACATA TAGTGGCTAT TTGGATGGAA TTGAACGTAT GTTAGAGATA 3960 

TCTGAAAAAC GGATGGTGGC CTAATGGAAA TCCAAGATTA TACTGATAGT GAATTCAAAC 4020 

ATCCTTTAGC AAGGAATCTT CGTTCACTGA CAAGAGGAAA AAAGTCCAGT AAGCAACCTA 4080 

TAGCGATTTT GCTTGGAGGG CAAAGTGCTG CCGGTAAGAC TACAATTCAT CGTATTAAAC 4140 

AGAAAGAATT TCAAGGAAAT ATTGTTATCA TAGATGGTGA TAGTTTTCGT TCTCAGCATC 4200 

CACACTATTT AGAACTGCAG CAAGAATATG GCAAAGACAG TGTAGAATAT ACCAAAGATT 4260 

TTGCAGGAAA AATGGTAGAG TCTTTAGTAA CAAAATTGAG TAGTTTGAGA TACAATCTTT 4320

TGATAGAGGG AACTTTACGA ACAGTTGATG TTCCAAAGAA AACAGCACAA CTCTTGAAAA 4380

ATAAGGGATA TGAAGTACAA TTGGCCTTAA TTGCGACAAA GCCTGAATTG TCGTATCTAA 4440 

GTACTCTTAT CCGTTATGAA GAACTGTACA TTATCAATCC AAATCAACCA CGCGCAACTC 4500 

CAAAAGAACA TCATGATTTC ATTCTAAATC ATCTAGTTGA TAACACACGA AAATTGGAAG 4560 

AACTAGCTAT CTTTGAAACA ATTCAAATTT ACCAACGAGA TAGAAGTTGT GTATATGATT 4620 

CAAAAGAAAA TACAACTTCA GCAGCAGATG TTCTTCAAGA GTTACTCTTT GGGGACTGGA 4680 

GTCAGGTAGA CAAGGAGATG TTGCAGGTGG GGGAAAAGAG ACTTAATGAA TTACTTGAAA 4740 

AATAAACAAT TGATATTTTT AGGAGAATAG AAATGAGAGG GTTTAATAAC AACATAAAGT 4800 

CTGTTTATCA AGAACTAACA AATTCCAAAC AGAAATTCGG TAGCTTTCAC AAGACTTTAA 4860 

TTCATTTCCA TACACCTGTT TCTTATGATT ACAAGCTATT TTCTAATTGG ACTGCAACGA 4920 

AATATAGAAA AATTACTGAA GATCAACTAT ATGATATATT TTTTGAAAAT AAGAAAATAA 4980

AAGTTGATAA GACAATTTTT TTTAGTAATT TTGATAAGGT TGTTTTTTCT AGTTCAAAAG 5040 

AATATATTAC TTTTCTTATG TTAGCACAGG CAATCATAAA AAATGGAATA GAAATAGTTG 5100 

TAGTAACTGA TCATAATACT ACCAAAGGTA TTAAAAAGTT ACAAATGGCA GTCTCAATCA 5160 

TAATGAAAAA TTATCCGATT TATGATATAC ATCCTCATAT TTTACATGGA GTAGAAATTA 5220 

GTGCAGCAGA TAAATTGCAT ATTGTATGTA TATATCATTA TGAACAAGAA TCATCCGTTA 5280 

ATCAATGGTT AAGTGAAAAT ATTATAAGTG AGAAAGATGG AAGTTATCAA CATTCACTGA 5340 

CTATAATGAA GGATTTCAAT AATCAAAAAA TAGTTAACTA TATTGCTCAT TTCAATAGTT 5400 

ATGACATTTT GAAAAAAGGT TCTCACTTAT CAGGTGCATA TAAACGAAAA ATTTTTTCTA 5460 

AAGAAAATAC ACGATTTTGG AGTTTAATAT TAACTCGAAA GAATCTTCGC AACAACTTGA 5520 

TATTCTCTAT AAAGAAGTTG GTGTATTAAG TTTGGGACAA AAAGTTGTAG CCATGCTTGA 5580 

TTTTTTATTA GCATATAGTG ATTATTCTAA AGACTTCAGA CCATTGATTA TTGATCACCC 5640

TGAAGACAAT CTAGACAATC GTTATATTTA CAGGCATTTA GTTCAGCAGT TTAGAGATGT 5700

GAAAGCTCAA CGTCAAATTA TTTTAGCAAC ACATAATGCT ACAATTGTAA CAAATTCTAT 5760

[illegible] G TTGTT & TT & TGGAGTCAGA TGGAGTTAAC GGATGGATTG AATCACAGGG 5820

[illegible] GAAAAATATA TAAAAAATCA TATCATCAAT CAATTAGAGG GAGGAAAAGA 5880

[illegible] [illegible] PTATATATGA GACGGCTTTA TCAGAGTAGA GTCAGAAAAA 5940

[illegible] [illegible] [illegible] TTGTCCGACA GGCATAGTGT ACATCTGAGG 6000

[illegible] [illegible] [illegible] TGAAACCAAT AGCGACTCCT AAGCCTGAAT 6060

[illegible] [illegible] [illegible] AGCGAAATCA AGGTTCTACA AACAGAATCG 6120

TGACTTGAAG [illegible] [illegible] Ar*Tf*TAAAAT CCAAATAGGT GTCGTAACCT 6180

ATATACGTAA [illegible] [illegible] AAGft.TGTAGG [illegible] TGAGCGTTTA 6240

GGACGTAGTA CAACGAATCA [illegible] CTfl AA f AG AT AGTATTGAAG AAATTTCTGT 6300

AATGGAAATG GAGCGAAGAA [illegible] A A A*TYt A AT AG rTGTGTAATT AAATTTGTCA 6360

ATTCTAATTC [illegible] [illegible] ffTGAAAATG TAAAGGATGG GAGCTGATCA 6420

TAAATATAGG ACGGTACATG [illegible] AflAHATTAGT GGTTACTTGA TTTGTGATAA 6480

CTTCCCCAAA TTTCTTCTGC [illegible] [illegible] AAAATPGAAC TAAGAATTTT 6540

ACCTGGGGGT [illegible] [illegible] TATGTTATCG TTAGGTGTCA AAACTGGTAG 6600

GTTTTGATAG GCTGGCGAI A [illegible] flATATTGTGG ACACAATATC TGAGCTCGCA 6660

AAGCCTTACA AGAATGAAAA [illegible] ftAAA AGTGTA GTGACATTGT ATGGTAGCTC 6720

[illegible] [illegible] [illegible] ACT AG C AG^ A TGAAACGAGA TGTGCGTGAT 6780

AT Tl_GC»AAAt. [illegible] [illegible] GAAGAAAAGC AAATTCTAGC TTTCATGAGA 6840

G AGCGGGv»r»o [illegible] [illegible] [illegible] GTTTACTTTC CTCTGATTTA 6900

[illegible] [illegible] GTTTGi'*r*G , T*G TGGCAATCCC AAAAACTAGA ACAAATCAGT 6960

CGTGACGTTC ATGAAGTTTT AATCTTGGCA CAGTCAGAAC GTCAAGTCAC CCAAGAGCAT 7020

GTATCTATTC TCTTAACGTG CGTGCAGGAA TTGATTCAAG AGGTTGCAAA CACCATACCC 7080

CTCAGTAAAG AATTTCGTGA GAAGTACATG AGGTAAGCAC ATGGAACATC GTTACCGAAC 7140

CAATCTCAAG AAAGTGTTTT TGTCTGATAG TGAGTTGAAC CAACTAAATA TAAATATCGA 7200

TCAAAGTGGT TGTAAATCCT TTTCTGAATA TCCGAGACCA ACTCTACTCG ATCCTGGTAT 7260

GAATTTTATC ACGATTGACA CAAACGGTTA CCAAGATTTA GTGTTTGAGT TAAAGAGGAT 7320

TGGCAATAAT ATCAACCAGA TTGCTCGAAG TGTTAATCAA TCTCAGTTAA TTTCTCGTGA 7380

AGAATTGCAG GAGTTGAAAA AAGGAATTGG TGAATTGATA AAAGAAGTTG ATAAGGAATT 7440

TAATCTGCAA GCGCAGAAGC TAAAGGAGTT CCATGGTCAT CACTAAACAC TTTGCCATTC 7500

ACGGAAAGAG TTACCGCAGA AAGCTTATCA AGTACATTCT CAATCCTGAG AAAACCAATA 7560 

ATCTTGCCTT GGTGTCGGAC TATGGCATGA AGAATTTTCT GGACTTTCCT AGCTATGAGG 7620 

AAATGGTGCA GATGTATCAT GAAAATTTCA TCAGCAACGA TACGCTTTAC GATTTTCCCC 7680 

ACGACAGGAT GGAAGAAAAT CAACGAAAAA TACACGCTCA CCACATCATT CAGTCTTTCT 7740 

CGCCAGAGGA TCATATCACT CCTGAACAAA TCAATCGGAT AGGTTATGAG ACTGTGAAGG 7800 

AATTAACTGG TGGCAAATTT CGTTTTATCG TTGCGACCCA TGTTGATAAA GACCACCTGC 7860

ACAATCACAT CATTATCAAT TCAGTAGATA GCAATTCTGA CAAAAAGCTC AAGTGGGACT 7920 

ACAAGGTGGA GCGAAATCTT CGCATGATTT CTGACCGTTT TTCTAAAATC GCAGGTGCTA 7980 

AAATCATTGA GAACCGCTAT TCTCACCAGC GGTATGAAGT CTATCGTAAG ACTAATCACA 8040 

AGTATGAACT CAAGCAGCGA CTCTATTTTT TGATCGAACA TTCTAGGGAC TTTGAGGATT 8100 

TCAAAAAGAA TGCTCCGCTA CTACATGTGG AGATGGATTT CCGTCACAAG CATGCCACCT 8160 

TTTTTATTAC GGACTCAACT ATGAAACAGG TCCTGCGTGG CAAGCAACTC AATCGCAACC 8220 

AGCCTTACAC AGAAGAATTT TTTAAGAACT ACTTTGCCAA AAGAGAAATA GAAAGTCTCA 8280 

TGGAATTTTT ATTGCTGAAA GTTGAGAATA TGGATGATTT ACTTCAGAAA GCAAAACTTT 8340

TTGGACTAAC TATCAATCCT AAACAAAAGC ATGTTTCTTT TCAATTTGCA GGAGTGGAGC 8400 

TAAAGGAGAC ACAGCTAGAC CAGAAAAATC TTTATGATGT ACAGTTTTTC CAAGATTATT 8460 

TTAAAAATAG AAAAGATTGG CAAGCTCCAG AAACTGAGGA TTTCGTTCAA CTTTATCAAG 8520 

AAGAAAAGTT ATCCAAAGAA AAAGAACTTC CAAGCGATGA GAAGTTCTGG GAGTCCTATC 8580 

AAGAGTTCAA GAGTAACAGA GATGCCGTTC ATGAATTTGA GGTGGAGTTG TCACTCAATC 8640 

AAATTGAAAA AGTAGTGCAT GATGGAATTT ACGTCAAGGT CAAGTTTGGT ATTCGTCAGG 8700 

AGGGACTTAT CTTTCTGCCG AACATGCAGC TTGATATGGA AGAGGATAAG GTGAAGCTTT 8760 

TCATCAGGGA AACCAGCTCC TACTATGTCT ACCACAAAGA CGCTGCCGAG AAAAATTGTT 8820 

ATATGAAAGG TCGAACCTTA ATTAGACAGT TCAGCTATGA AAATCAAACC ATTCCATTAC 8880 

GCAGAAAAGC GACAGTCGAT ATGATTAAAG AGAAGATTGC GGAAGTGGAT CCTTTGATTG 8940 

AACTGGAAGT AGAAAATCAA TCTTATGTCA CGATTAAAGA TGAGTTAGTG CATGAACTAG 9000 

CACCGTCTGA ATTGAGAATC AATGAGTTGC AAGAACGAAT GTCAACCTTG AATCAAGTAG 9060 

CAGAATATCT ACTGGCTTCA GTTGAAAGTA AGCAAGAAAT GAAATTAAAT CTTTCAAAAC 9120 

TGAATATAAC TGAGAATATC AGTGCTAATA TTGTTGAGAA AAAATTGAAG AGCCTGGGGA 9180 
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ATCAACTGGA ATTGGAAAGG GGCAGGTATG AAAAGATGGT AGT 9223 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6827 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

TCTGCTGGCT ACCATCATCT GACTTGGGCA AGACCAAAGT CTTAGTTACA ACTGTATTCT 60 

TCTCAGCATT TTCAATAACT GGCAATGCCG ACTGAAGCGT ATCTTTTTCT GTTTTTGTAG 120 

CTGGTCCAGT TTCTTTTTTC TGTCCGCAAC CAACCAGGAC AAAAAGGAAA GCTAGACTAA 180 

CAAGAACTAT TTTTTTCATT TCTTTCTTCT TTCTTTTTGA AATTAAAATA GAATAAGACT 240

GGGAAGTGCT CCCAGCCTTG ATGTTTATAG AGCTGCACGC AAACGTGCTT CTGCATTTTC 300 

TACATTACGG ACAGAGCGTG GTAGGAAGGC ACGAATATCG TCTTCCTTGT AGCCAACTTG 360 

CAGGCGTTTT TCATCTACAA GGATTGGGCT CTTTAAAATT CTCGGTGTTT CCATAATCAG 420 

ATTGAGAACT TCATTGACAC TCAAATCTTC AATATCCACT CCAAGGGCTT TGGCATAGCG 480

ATTTTTAGAC GAAACGATGC TGGCTATTCC GTTATCTGTT TTGGTTAGAA TATCCAGTAA 540 

TTCTTCTCTC GTAATTCCTT CTTTACCAAG GTTTTGTTCT TTATAACTTA ACTGGTGGGC 600

ATTGAGCCAG GTTTTTGCTT TTTTACAGCT AGTACAACTT GAGACTGTAT AAATTTTAAT 660 

CATGTACCTA CCCCTTTCCC TACATGTTAC TATCAGTTTA GTCTATTATA CCATAAAAAA 720 

CATCCGACTT GCGACCTATT TTTAATTTTT TTTGACTTTT TTCGTCATTT TCGTACTTTT 780 

TTCTTGACAA ACAACTAAAT GACTATCAAC TCTTTTCGAG CTAGGGTCAA TAATTCACAA 840

CCTGTCTCTG TAATCAGGAT ATCATCCTCG ATACGAACGC CATATTTGCC TTCGATATAG 900 

ATACCTGGTT CATCGGTCAA GGCCATACCT GTCTTAATAG TTTCTGTAGA AGTCTGACTA 960 

AAGTAGGGTT CCTCATGGAT ATCCAGACCA ATACCGTGGC CAATGCCGTG AGTAAAGTAG 1020 

TCACCATAAC CTGCCTCAAT GATAATATCA CGAGGGATTT TGTCAAAGTC ACGGAAACCT 1080 

AAGCCTGCCT TAGCTTGGTC AATCAAGGCT TGGTTAGCTT TTAGAACCGT ATTGTAAATC 1140 

TCTGCCTGCT CATCGCTAAC ATGCCCTAGA TAGATAGTCC GGGTCATATC ACTGACATAG 1200 

TGGTCATAGA GACAGCCGAA GTCCATGGTG ATGGCTTCTC CCAACTCCAC TGGTTTGTGC 1260 

ATTGGATGGG CATGGGGTTT AGAAGAATTG ATACCGCTAG CTAGGATCGT ATCAAAAGAT 1320 

AAGCCAGATG CTCCCAACTC ACGCATGCGG AAATCAAGGA AGTTGGCAAT CTCAATTTCA 1380
GTTTTTCCTG GTTTGATAAA GTCAAGCGCA TCGCGGAAAG CTTGGTCTGA GATAGAACAA 1440

GCCTTGCGAA TCGCTGCAAT CTCTGCCTCA TCCTTAATCA TACGAACACC TTCCACAAAC 1500 

TGAGTTTGTG GAAGCAAGTT CAAACCTGCA AAAGCTGCCT GCATACGGTG GTAATAAGAC 1560 

ACTGAAATCT CATCTTCAAA ACCGATACGA GTCAAGCCCA TGTCCTTAAC AATTCCTGCA 1620

ATGACAGCCA ATTCATCACG ATCAGCCACA ATCTCAAAAC CACTGGTTTC TTGCTTAGCT 1680

GCGATGATAT AGCGAGAGTC TGTCACTAAG ACCTGACGGT CACGACTGAT AAAGACTGTT 1740 

CCGTTTGAGC CCCAAAAACC AGTCAAATAA TAGACGTTTT TAAGATTGTT GATGATGATA 1800 

CCATCTAGTT CTTTTTCTTG CATTTTAGCT AGAAATGCTT GTACGCGTTT ATTCATGATG 1860 

TAACTTTCCT TTCAAATAGT GTCCTGTATA GCTGGCTTCG TTGGCAGCTA CTTCTTCTGG 1920 

AGTTCCTGTT ACGATGATGG TTCCACCACC GACACCGCCC TCAGGTCCCA AGTCAATGAT 1980 

ATCGTCTGCC CTCTTGATAA CATCCAGATT GTGCTCGATG ACGAGGACTG TATTGCCATC 2040 

GTCTACAAAG CGAGCTAAAA CCTTGAGCAG GCGAGCAATG TCCTCTGTAT GAAGCCCTGT 2100 

CGTCGGCTCA TCCAGAATCT AGAAAGATTT TCCTGTCGAT CCTTTGTGGA GTTCGCTAGC 2160 

TAACTTCATA CGTTGGGCTT CTCCCCCAGA AAGGGTGGTA GCTCGCTGTC CCAAGGTCAC 2220

ATAGCCTAGC CCTACATCCT TGATGGTCTG GAGTTTGCCT TGAATTTTCG GAATGTGTTG 2280 

GAAAAATTCT ACCGCATCGT TGACCGTCAT ATCCAAGACC TGCGAAATAT TCTTTTCCTT 2340 

GTAGTGAACT TCTAGGGTTT CACTGTTATA GCGGGTTCCG TGGCAAACTT CACAAGCCAC 2400

ATAAACATCT GGCAAGAAGT GCATCTCAAT CTTGATAATC CCGTCACCTG AGCAAGCTTC 2460 

ACAGCGACCT CCCTTGACGT TGAAACTGAA GCGCCCCTTC TTGTAGCCTC GAATCTTGGC 2520 

TTCATTTGTC TGAGCAAAAA GGTCACGTAT ATCGTCAAAA ACTCCTGTAT AGGTAGCTGG 2580

GTTAGACCTC GGCGTCCGTC CGATAGGGCT CTGGTCAATA TCAATCAAAC GGTCGACATG 2640

CTCAATCCCT GTAATAGTCT TAAACTTACC AGGTTTGTCT GAATTACGGT TGAGCTTCTG 2700

GGCAATGGCT TTTTTGAGAA TGCTGTTGAT TAGAGTCGAT TTCCCTGAAC CCGACACACC 2760 

TCTCACTGCG ATAAATTTTC CTAGTGGAAA GCGAGCCGTG ACATTTTGCA AGTTGTTCTC 2820 

ACGCGCTCCT ATCACTTCAA TAAAACGACC ATTTCCGACA CGGCGCTCTT CTGGTACTGG 2880 

GATGACACGT TTGCCTGACA AGTACTGACC TGTGATAGAC TTGCTGTTGC GAGCCACTTG 2940 

CTTAGGTGTA CCTGCTGCAA CAATCTCACC ACCAAAAACA CCGCCACCAG GACCAACGTC 3000 

AATCAGATAA TCAGCCTCAC GCATGGTATC TTCGTCGTGT TCCACCACGA TAAGAGTATT 3060 

GCCCAAGTCA CGCATCTTTT TCAGACTGGC AATCAGGCCA TCATTGTCCC TCTGGTGAAG 3120 
ACCGATTGAC GGCTCGTCTA GGATATAGAG GACACCTGAT AGCTTGGAAC CAATCTGGGT 3180

TGCCAAACGA ATGCGCTGAC TTTCCCCACC TGAAAGGGTT CCTGCTGAAC CTGACAGGGT 3240 

TAGATAGTTA AGACCCACAT TATTAAGGAA GGTCAAACGA TCCTTGATTT CCTTGAGAAT 3300 

GGGACGAGCA ATGATGGCTT CATTTTCAGA CAAAGTTAAC TGGCTCACCA AGTCCAAGTG 3360 

GTCAGCGATA GACAGGTCTG AGATTTCTCC AATATGTGGC CCTTGCTGGC CGCCCACACG 3420 

GACAGACAAG GCCTGGTCAT TGAGACGATA GCCTTGACAG GTTCCGCAGG TCAGCTCATT 3480

CATGTAGAGA CGCATCTGAG TGCGAGTGTA ATCGCTATTG GTTTCATGGT AACGACGTTT 3540

GATATTATTG ATAACTCCCT CAAACGGAAT GTCGATATCG CGCACGCCAC CAAATTCATT 3600

CTCATAGTGG AAATGGAATT CCTTACCATC TGACCCATAG AGAATCAAGT TCTTATCTTC 3660 

TTCTGACAGG TCCTCAAAAG GCTTATCCAT AGCCACTCCA AAGACTTTCA TGGCCTGCTC 3720 

TAACATGTTT GGATAGTAGT TGGATGAGAT AGGATTCCAA GGTGCTAGCG CTCCCTCACG 3780

TAAGGTTTTG CTAGCATCTG GCACTACCAA ATCAGTATCC ACCTCCAGCT TGATGCCCAA 3840 

GCCGTCACAC TCACTACAAG AGCCAAAAGG AGCATTGAAA GAAAAGAGAC GAGGCTCTAA 3900 

CTCTGGGACA GTAAAACCAC AAACTGGACA CGCATAATGC TCAGAGAACA ACAACTCCGA 3960 

GTCGTCCATG GTGTCGATAA TGACATAACC TTCTGCAATA CGAAGGGCAG CCTCAATGGA 4020 

ATCAAAGAGA CGACTACGAA TGCCCTCCTT GATAACAATA CGGTCAACCA CGACATCGAT 4080 

ATTGTGTTGC TTGCTCTTAG ACAACTCTGG CACTTCGGTC ACATCATAGA CTTCCCCATC 4140 

CACACGGACA CGAACATACC CGTCTTTCTG AACCTTCTCG ATAACACTCT TATGTTGGCC 4200 

TTTTTTCTTG CGGATGACAG GAGCCAAGAT CTGCAAGCGC TGGCGTTCAG GTAACTCCAA 4260

AACCTTATCA ACGATTTGCT CCACAGAAGA AGCATTGATA GCTCCATGTC CGTTGATACA 4320 

GTAAGGCGTC CCCACACGTG CGTAGAGGAG ACGCAGATAG TCATTGATTT CAGTCGTCGT 4380

TCCCACCGTC GAGCGAGGAT TTTTACTAGT CGTTTTCTGG TCGATGGAAA TAGCTGGGCT 4440 

GAGACCATCA ATGGCATCTA CATCTGGTTT TTCCATATTT CCCAAGAACT GACGAGCGTA 4500 

GGCGGACAAA CTCTCTACAT AGCGACGTTG TCCCTCCGCA TAGAGAGTAT CAAAAGCCAG 4560

ACTGGACTTC CCTGAACCTG ACAAGCCAGT CACGACAACC AACTTGTCTC GCGGAATCTC 4620 

CACATCAATA TTTTTTAAAT TATGGGCACG CGCCCCATGA ATGACAATTT TATCTTGCAT 4680 

CTTTGTTCTT TCTAGTCCAT TATTGCTTAC CATTATACCA AAAAAAGTGA GATTCTATTA 4740 

CCCAAAAGGC CCATTTTGTA GTATAATAGT ACAGTGTGAA AAAATCTGAA AAATGAGAAA 4800 

GGATAAGGGA TATGAAACAA GTTTTTCTCT CTACAACAAC TGAATTTAAA GAGATCGATA 4860

CGCTTGAACC GGGTACTTGG ATCAATCTCG TCAATCCGAC TCAAAATGAA TCACTCGAAA 4920 
TCGCCAACAC CTTCGATATT GATATTGCTG ACCTTCGAGC ACCGCTCGAT GCGGAAGAAA 4980

TGTCTCGTAT TACCATTGAA GACGAGTATA CCCTGATTAT CGTAGACGTG CCGGTCACGG 5040 

AGGAAAGAAA TAACCGCACC TACTACGTAA CCATCCCGCT TGGTATTATC ATCACTGAGG 5100 

AAACCATTAT CACTACGTGT TTGGAACCAC TACCTGTCCT TGATGTCTTT ATCAACCCTC 5160 

GATTGCGTAA TTTCTATACC TTCATGCGTT CACGTTTTAT CTTTCAAATT CTTTATCGCA 5220 

ATGCAGAGCT TTACCTAACA GCCCTTCGTT CAATCGACCG CAAGAGTGAA CAAATCGAAA 5280 

GTCAACTGCA TCAATCAACT CGTAATGAAG AATTGATTGA GCTCATGGAA TTGGAAAAAA 5340 

CTATCGTCTA TTTCAAGGCC TCCCTCAAAA CAAATGAGCG CGTGATTAAG AAATTGACCA 5400 

GTTCAACCAG CAATATCAAG AAATACCTTG AGGACGAAGA CCTGCTTGAA GACACCCTGA 5460

TTGAAACCCA ACAGGCCATC GAGATGGCAG ATATTTATGG AAACGTCTTG CATTCTATGA 5520 

CAGAGACCTT TGCCTCTATC ATTTCTAACA ACCAGAACAA CATCATGAAA ACCTTGGCCC 5580 

TTGTGACCAT CGTCATGTCC ATCCCAACCA TGGTCTTTTC TGCCTACGGG ATGAACTTTA 5640 

AGGATAATGA AATCCCCCTA AACGGAGAGC CAAATGCCTT CTGGTTAATC GTCTTTATCG 5700 

CCTTTGCTAT GAGTGTCTCG CTCACTCTCT ATCTCATCCA TAAAAAATGG TTCTAAGAGG 5760 

AGTTCCTATG TCTCAAATTG ATCTACAAAA ATTAACTAAG AAAAACCAAG AGTTTGTCCA 5820

CATTGCTACC CAACAATTCA TCAAAGATGG CAAAACAGAC CCTGAAATCC AGACTATTTT 5880 

TGAGGAAGTC ATTCCCCAAA TCCTTGAGGA GCAATCTAAA GGTACAACTG CCCGTTCCCT 5940 

ATACGGCGCA CCAACTCATT GGGCTCATAG CTTCACTGTC AAAGAGCAGT ACGAAAAAGA 6000

GCATCCAAAA GAAAATGATG ACCCAAAACT GATGATTATG GACTCAGCTC TTTTCATCAC 6060 

TAGCCTCTTT GCCCTTGTCA GCGCCCTCAC AACCTTCTTT CCGGCAGACC AACCTTTCGG 6120 

CTATGGATTG ATTACTCTTC TATTAGTTGG ACTGGTTGGT GGATTTGCCT TCTACTTGAT 6180 

GTACTACTTT GTTTACCAAT ACTATGGACC AGATATGGAT CGCAGTCAAC GTCCACCTTT 6240 

CTGGAAATCT GTACTAGTTA TCCTAGCTTC TATCTTCCTT TGGTTGCTTG TCTTCTTTGC 6300 

AACAAGCTTC CTACCAGCTA GCCTTAACCC AGTACTGGAT CCATTGCCAC TACCTATTAT 6360 

TGGAGCAGCC CTCCTAGCCC TTCGCTTCTA TCTCAAGAAA CGCTTGAATA TCCGTAGTGC 6420 

AAGTGCAGGA CCAACACGCT ATCAAGAATA AGAAAACGAT AAAAGCAACT GCAGGTGCGG 6480 

TTGCTTTTTC ACTTACTTTT TTGAGTTATA TTCAATGAAA ATCAAAGAGC AAACTAGGAA 6540 

GCTAGCTGCA GGTTGCTCAA AGCACAGCTT TGAGGTTGCA GATAAAACTG ACGTGGTTTG 6600 

AAGAGATTTT CGAAGAGTAT TAAAAGTATT CTTCTGAAAT CCCACATAGC TTTCTCTTAT 6660 
ATTTTGTGAT AAAATAGGCT CAATCTATTT CTAGGAGGAT GAGATATGGT TTCTACTATT 6720

GGTATTGTTA GTTTATCTAG TGGCATTATC GGAGAGGATT TTGTCAAACA CGAAGTGGAC 6780

TTGGCTATCC AACGTCTCAA GGATCTGGGA CTCAATCCCA TCTTTTT 6827 
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

CTGGCTAGTT GCATACACCA AAGTTGCTTC TTCATCAACA AAACCGTTCA TTTCAAAATA 60 

GGAAAGCAGC TCATCAGGAC TCTCCAAACG AATCCCTTTG TAATCCAGCT CAACTGCCAC 120 

CTCTTTCAAG GCTGCAAGAA GAAGTGTTCC CAGGCCCTGT CTCTGATGGT CAAACTCGAT 180 

GACTAAAGAA TGTACTTTTA GACATTGCGG ATTGTCTGAC TGGGGACTTG ATAAAATATA 240 

GCCTAAAAGT TGATTTTCAT CCCTAGCTAG AAGAAAGGTA TCCGCACACT TACGGATACT 300 

TTCTTCTAAA ATATGGGAAA GTTGCTGCTT TTCAGCTGGA AAAGACGAGG TCTGAAGTGC 360 

CCCTATCTCA GGCAAATCAG ACTTGCTTGC CTGAATGATC TTAATTGGAA TTTCCATGGG 420 

AACATCCTAT TGAACATTGC TTGTCAAGTT AGACAAGAGA CGCTCAAATG AGTATTCATA 480 

GGTTTGGATG TCTCCTGCTC CCATAAAGAC GTAAACAGCA TTGTCATGGT CTAGGAGTGG 540 

AGAAACATTT TCAACAGTAA TCACTTGGTG TTTTTTGTTG ATTTTGTTGG CTAGGTCTTC 600 

TACCTTAACG TCACCATGAT CTACTTCACG AGCCGAGCCA TAAATTTGCG CTAGATAAAC 660 

AGCATCTGCT TGGTTTAAAG CATCGGCAAA GTCGTCCAAC AACGCAATGG TTCTTGTAAA 720 

GGTATGCGGT TGAAAGACTG CTACAATTTC CTTGCTTGGG TATTTCTGAC GAGCCGCATC 780

CAAGGTCGCA ATAATTTCTG TTGGATGGTG GGCAAAGTCA TCGATAATCA CTGTATCATT 840 

GACAATTTTC TCAGTGAAAC GACGTTTAAC ACCGGCAAAT GTTTTCAAGT GCTCACGCAC 900 

CAAGTTCAAA TCAAATCCTG CTGTGTAAAG AAGACCAATA ACGGCTGTCG CATTCATGAT 960 

ATTGTGACGA CCAAAGGTTG GAATGTGGAA TTGCCCCAAG TTTTGTCCAC GGAAATGAAC 1020 

GGTGAAGGTT GAACCAGTTA TTGAACGAAG AAGATCACTA GCTACAAAGT CATTGCCTTC 1080 

AGCTTCAAAA CCATAATAAT AAATTGGTGC ATCAGACGTA ATCTTACGCA ATTCAGCATC 1140 

TTCACCATAG ACAAAAAGAC CCTTGGTGAT TTGTTTGGCA TAGTCGTTAA AGGCATTAAA 1200 

AACATCCTCG AGACTTGTGA AATAATCTGG ATGGTCAAAG TCAATGTTGG TGATAATAGA 1260 






GTATTCTGGG TGGTAAGGCA TGAAGTGACG CTCATATTCG TCAGATTCAA AGACAAAATA 1320

TTTGGCATTG GCCGAACCAC GACCTGTCCC ATCTCCAATC AAGAAGCTGG TATCTGTAAT 1380

GTGAGACAAG ACATGAGACA ACATACCTGT CGTTGAAGTT TTTCCATGTG CTCCTGCTAC 1440

TCCCATGCTA ACAAAGTCAC GCATAAAGCT ACCTAGAAAC TCATGGTAAC GTTTGTAGCT 1500

GATACCATTT TGGTCCGCAT AGGCAATTTC GACGTTGTTA TCTGGACGAA AGGCATTTCC 1560

AGCGATAATT TCCATATCAC CGTCTAGATT TTTTTCATCA AAAGGAAGAA TGGTAATTCC 1620

TGCCTGCTCA AGACCGCGTT GGGTAAAGTA GTACTTTTCA ACATCTGATC CCTGAACCTT 1680

GTGCCCCATC TGGTGCAACA TCAAGGCCAA GGCACTCATC CCTGATCCCT TAATTCCGAT 1740

AAAATGATAT GTCTTTGACA TGTTTTCTCC CCTATTCTGT CATTCTGGTC AGATTCAACT 1800

CTTGGGCAAC CCGACGTTCT TGTTCTGTTT GTTTACTTTT TTTATTGTAG ATTTGGCTCT 1860

TCTTTAGAAA ATCATAATTG TTTTTCTTTG GAGCAGGTGC TGACACTTCT TCATTCTTGG 1920

TAGGGATAGA ATGAACTTCT TCCGCCAAGA TATAATGAGA CTGGGTCAAT TTTTGGCTAT 1980

ATTTGACAAA TTCACCAGGA TTTTCCTTTT GGAAAGGAGC TGTCGGTTGA TTGCCCTGTC 2040

TAACTAGACT GGGCTGAGAA TGACGTCTCG CAAGGCTGAA ATCCTGAGT? AGGTAGTTAG 2100

CAGAGCGTTT CTTTTTCAAG TCCGCACGCG CTTCTTCACG CGCCACCTCC GCATAGCTCT 2160

TTCCTTCTTT TTTAACCCCT AAAGGAGCCT TTTTAGGTTT TTCGACTTGC TTTTCAATCG 2220

GTTTTACTGG TTTTTCTTCA GCAATAGGAG CCCATTCTAA ATAATTTTTA TCTCGATACT 2280

CACCCTTGAT ATTACTGATC AGATCAGACT CATCATAGAG ATTCATGACT GGCATTTCAC 2340

TCAACATGAC CTCGTCATCT GACACCAATG GAAATCGTTC TTGTTTCATT TTCTATTTCC 2400

TTTCAACACT TCATTATAGC GTATTGTCTT GATTTTTCAA CTCCTGCCTT CAGAAATTCC 2460

CAAAATTTCT CTAATTTCTG CTAGGGTCAG ACTACCACGT GACTCTGTGC CGTCCAATAC 2520

TTGTGACACC AGATGTTTCT TTTGTTCTTG GAGTTCCTGA ATTTTTTCTT CAATGGTTCC 2580

CTTGGTCACC AAGCGATAGA CCTCAACCGT TTCTTCCTGA CCCATCCGAT GGGCACGGCC 2640

AATGGCTTGC GCTTCCACCG CAGGATTCCA CCAAAGGTCA ACCAAGATCA CTGTATCTGC 2700

ACCTGTCAGG TTCAGACCGA CCCCACCAGC CTTGAGGGAA ATCAGAAAGG CATCTCTTTC 2760

TCCTTGGTTA AAGGCCTTGG TCATGTCTTG TCTTTCCTTG GCTGGGGTTG AACCCGTAAT 2820

TTTAAAGGAA GTCAGGCCCA AGTCTGGCAG TTCTTGTTCA ATTTTTTCCA ACATTCCCTT 2880

GAACTGAGAG AAAATCAAGA CACGGTGTCC GCCGTCTGCC ACCTGTACCA GTAGGTCTCG 2940

GAGACTATCT AGTTTGCCGC TGGCTCCCTG ATAATCTTCC ATAAACAGGG CAGGAGTGTC 3000
ACATATTTGA CGCAAGCGCA TCAAACCAGA TAAAATTTCC ACACGACTTC GCTGAAATTC 3060

CTGTTCTGAC ACTTGAGCCA GATGGTCTCG CATCTGTTGT AACTGGGCAA GGTAAATAGC 3120 

CTTTTGCTGG TCTTCCAGTT CATTTTTATA AACCACCTCA ATCAAGTCTG GCAATTCAGT 3180 

CAGAACTTCT TCTTTCTTGC GTCGCATCAC GAAAGGCTTG ATAAACTGAG CCACTCGCTC 3240 

TGCTGGCAAT TTCATAAATT CTTTCTTGCT TGGCAAAAGT CCAGGCATGA CGATTTGGAA 3300 

AATAGACCAC AACTCACCCA GATGGTTTTC AATCGGAGTT CCTGACAAGG CAAAGACCGA 3360 

CGGCACCACA AATTGTCTCA AGGTCTGGGC AATCTTGGTC TGGGCATTTT TCATGACCTG 3420 

AGCCTCATCT AAGAAAAGGA AGTCAAAGGC CATCCCTTGA TAAAACTCAC TGTCCTGACG 3480 

GAAGGTGGCA TAGCTAGTCA CATAGATTTG ATGGCTCTCG GCAAGAATCT CCTCACGACT 3540 

TGCTTTCAAA CCATGAACAA CAGTCACATC CAACTGTGGA GCAAATTTCT GAAACTCATC 3600 

TGCCCAGTTG TAAATCAAAC CCGACGGAGC GAGAATCAAA ACCCGACTTT CTTTTGTCAC 3660 

TTGACTAGTC AAAAAAGCAA TGGTCTGAAG GGTTTTCCCA AGTCCCATAT CATCAGCCAA 3720 

AATCCCACCA AAACCATAAT GATGGAGCAT CTGCAACCAG CCAATTCCCT TTTCCTGATA 3780 

ATCTCGCAAG TCAGCCTTGA CCTGAGTTGC TTGCAAAGGA AAGTCCTCTG GATGCGTCAA 3840 

ATCCTGGGCC AGATTCTGGA ATTCTTGTGA AAAAGAAACA CGGTCTCGCC CTTCAAAGAG 3900 

ATGAGCTAAA CTCTAGGCCA AGGATTTCCG AGCCTGCAAG GTCCCATCTT TTAATTCAAA 3960

TTGCCCCAGT TCCTGTAGAT TTTGGCGAAT TTTCTTGGTT TCTTCATCGA AAAAGTAAAC 4020 

TTGATTAGAC GAATCAATAT AAAAATCCTG ATTGGCAACC AAGGCCTCCA TGGCTTGGTC 4080 

CATTTCCTCC TGGACAATAT TTTGAAAATC AAACTGGATT TCCAAGAGAC CTCCCTTGGA 4140 

GGCAATCTGC ACCTGAGGAC TCGCTAGGCT ATAAAGCTCT TCTAGTTTAT CTGATAGGTC 4200

AACATGCCCG AGTTTTTCAA AGACTGGAAT GATATCATGA AAAAAATGAT AGACAGACTC 4260 

CGCTTTTAAG GCCTGACGCC AAGATTGAAA ATCGGCCTCA AAGCCCGCAG CCAAACAGAC 4320

TTGGAAAATT CTTTCTTCTA AGTCTGCGTC ACTTGAAAAG GGTAATTCTT CTAGCTCTTG 4380

TCGCCTAGAT ACCTGTCTAT TTCCATAATC AAACTGAATT TCTAAACGAA TCCGATTATC 4440 

TTCTTCCCTG TCAAAGTAAA AAGAGGGCGC AAAAGTTTTG ATTTGTAGAC GTTCTGGACC 4500 

TGAAACGGTG CCCATCTGGA TAAAAAGAGT CAGACAGGAG GCCAATTTGT CTCGATCACT 4560 

CCTATCAAAT TGCAGGTATT TCTTTCCTTG TTGACCCACA CGTAACGCTT TAATTTCCTT 4620

GAGAAGACGC ATCTGCTGGT CTGTTAAAAA ATAAACCTGA CCTTTATGGA AAAGTACTGC 4680 

TCCCTGATAA AAGACATTGA CCCTAGGACT CTCACTGATT TCCATTTCAA AATAATCCGA 4740 

GTATTCTGTT ACTGTAAAGG CAAATAGATT GGCATCAGCA TGCATATCCT GAAAAAGCAG 4800 